Method for producing a synthetic gene or other DNA sequence

ABSTRACT

Disclosed herein is a method for synthesizing a desired nucleic acid sequence. The method comprises dividing the desired sequence into a plurality of partially overlapping segments; optimizing the melting temperatures of the overlapping regions of each segment to disfavor hybridization to the overlapping segments which are non-adjacent in the desired sequence; allowing the overlapping regions of single stranded segments which are adjacent to one another in the desired sequence to hybridize to one another under conditions which disfavor hybridization of non-adjacent segments; and filling in, ligating, or repairing the gaps between the overlapping regions, thereby forming a double-stranded DNA with the desired sequence. Also disclosed is a method for preventing errors in the synthesis of the nucleic acid sequence.

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims the benefit of U.S. Provisional Patent Application No. 60/472,822, filed on May 22, 2003, the disclosure of which is incorporated by reference.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present application is generally related to the synthesis of DNA molecules, and more particularly, to the synthesis of a synthetic gene or other DNA sequence.

[0004] 2. Description of the Related Art

[0005] Proteins are an important class of biological molecules that have a wide range of valuable medical, pharmaceutical, industrial, and biological applications. A gene encodes the information necessary to produce a protein according to the genetic code by using three nucleotides (one codon or set of codons) for each amino acid in the protein. An expression vector contains DNA sequences that allow transcription of the gene into mRNA for translation into a protein.

[0006] It is often desirable to obtain a synthetic DNA which encodes the protein of interest. DNA can be synthesized accurately in short pieces, say 50 to 80 nucleotides or less. Pieces substantially longer than this become problematic due to cumulative error probability in the synthesis process. Most genes are appreciably longer than 50 to 80 nucleotides, usually by hundreds or thousands of nucleotides. Consequently, direct synthesis is not a convenient method for producing large genes. Currently, large synthetic genes with a desired DNA sequence are manufactured by any one of several methods:

[0007] 1. If the gene does not contain introns it can be synthesized by PCR directly from genomic DNA. This is feasible for genes of bacteria, lower eukaryotes, and many viruses, but nearly all genes of higher organisms contain introns.

[0008] 2. A related alternative is to PCR the gene from a full-length cDNA clone. It is time consuming and tedious to isolate and characterize a full-length clone, and full-length cDNA clones are available for only a very small fraction of the genes of any higher organism.

[0009] 3. Based on the gene sequence inferred from the genomic DNA sequence, short DNA segments of both strands of a gene can be synthesized with overlapping ends. These segments are allowed to anneal and are joined together with DNA ligase. Annealing efficiency and accuracy at the segment junctions are often poor, resulting in low yields.

[0010] 4. An approach to reduce this problem is to build the gene up in subsections in a step-wise manner. This remains time-consuming, expensive, tedious, and inefficient, because many reactions must be performed.

[0011] 5. Based on the gene sequence inferred from the genomic DNA sequence, short overlapping duplex DNA segments of the gene can be synthesized that contain compatible end-proximal restriction endonuclease sites. Each fragment can be cut with the appropriate enzyme, annealed, and joined with DNA ligase. In addition to the limitations above, both strands of the gene sequence must be synthesized and this method is dependent on the placement of appropriate restriction sites evenly spaced throughout the gene sequence.

[0012] 6. Genes have also been assembled by overlap extension of partially overlapping oligonucleotides using DNA-polymerase-catalyzed reactions. The gene is divided into oligonucleotides, each of which partially overlaps and is complementary to the adjacent oligonucleotide(s). The oligonucleotides are allowed to anneal and the resulting DNA construct is extended to the full-length double-stranded gene. See, for example, W. P. C. Stemmer et al. “Single-step assembly of a gene and entire plasmid from large numbers of oligodeoxyribonucleotides” Gene, 1995, 164, 49-53 and D. E. Casimiro et al. “PCR-based gene synthesis and protein NMR spectroscopy” Structure, 1997, 5, 1407-1412, the disclosures of which are incorporated by reference. Designing the oligonucleotides for gene synthesis by this approach has recently been automated as described in D. M. Hoover & J. Lubkowski “DNA Works: an automated method for designing oligonucleotides for PCR-based gene synthesis” Nucleic Acids Res., 2002, 30:10 e43, the disclosure of which is incorporated by reference. The method optimizes codon usage, optionally removes DNA hairpins, and uses a nearest-neighbor model of DNA melting to achieve homogeneous target melting temperatures. The process of removing local DNA hairpins and dimerization from a single DNA oligonucleotide is also referred to by those skilled in the art as “removing DNA secondary structure.” The methods described in these references do not globally optimize a melting temperature gap between correct hybridizations and incorrect hybridizations, however.
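
The cumulative error probability that limits direct synthesis, noted in paragraph [0006], can be illustrated numerically. The following sketch assumes, as a simplification not stated in this document, that each coupling step succeeds independently with the same efficiency, so that the full-length fraction of an n-nucleotide oligonucleotide is the efficiency raised to the power n - 1. The function name and the lengths printed are illustrative only.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of chains reaching full length.

    Assumption (not from the source): an oligonucleotide of length_nt
    needs length_nt - 1 couplings, each succeeding independently with
    probability coupling_efficiency.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must lie in [0, 1]")
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    # Compare short pieces with pieces well beyond the 50 to 80 nt range.
    for n in (20, 50, 80, 250):
        row = [round(full_length_yield(n, p), 3) for p in (0.995, 0.99, 0.98)]
        print(n, row)
```

Under this model the yield falls geometrically with length, which is consistent with the observation that pieces substantially longer than 50 to 80 nucleotides become problematic.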

SUMMARY OF THE INVENTION

[0013] The present application provides a method for synthesizing a DNA sequence and the DNA sequences synthesized by the method. A preferred embodiment of the method utilizes the flexibility of the genetic code to achieve melting temperatures that optimize the simultaneous annealing of many gene segments in the desired order, facilitating the assembly of a large strand from many small ones.

[0014] In some embodiments, the likelihood of synthesizing the correct DNA sequence from a mixture of correct and incorrect gene segments is increased by determining a property of the DNA sequence or fragment thereof, or of a polypeptide or protein expressed therefrom, or of another molecule derived therefrom in order to ascertain the correctness of the DNA sequence or fragment thereof. Examples of suitable properties include the DNA sequence and the molecular weight of the polypeptide expressed therefrom.

[0015] The method in the present application is of practical utility. A number of companies currently offer synthetic gene services. Consequently, a method with improved efficiency is valuable. A partial list of such companies may be generated easily, for example, by performing an Internet search for “custom gene synthesis,” “synthetic gene services,” or related keywords.

[0016] Those skilled in the art will immediately comprehend myriad applications to which the disclosed method may be applied, including: (1) creating de novo “designer” proteins; (2) coupling to automated expression and crystallization facilities; (3) building DNA sequences predicted to express novel protein folds for structural proteomics; (4) building other DNA sequences that do not encode proteins, e.g., as RNA structural templates or DNA nanotechnology components; (5) expressing proteins from a different species in a desired expression vector according to its own codon usage preference; and (6) creating a small synthetic genome by specifying its desired protein sequences and regulatory protein binding sites.

[0017] The present application provides a recursive method for synthesizing a gene of arbitrary size, i.e., a double-stranded DNA (dsDNA) sequence that codes for a desired peptide sequence, possibly with flanking regulatory and intergenic sequences extending into other flanking genes. The disclosed method uses sequence degeneracy to achieve melting temperatures that optimize the simultaneous annealing of many gene segments in the desired order. This optimization is achieved by choosing bases and codons such that, with high probability, incorrect or wrong hybridizations melt at a low temperature and correct or right hybridizations melt at a high temperature, allowing the construction of a large synthetic gene quickly and easily.

[0018] The method comprises: (a) hierarchical assembly (b) by high-fidelity techniques such as overlap extension using proof-reading DNA polymerase, ligation, cloning, or other methods, (c) with optimization of the sequences of the component oligonucleotide pieces to facilitate preferential hybridization to the desired adjacent piece(s) and to disfavor undesired hybridizations between other pieces, for example, by exploiting the degeneracy of the genetic code or a regulatory region consensus sequence, (d) to achieve a DNA melting temperature gap between correct (high melting temperature) and incorrect (low melting temperature) hybridizations, and (e) optionally selecting pieces of DNA likely to have the correct DNA sequence for subsequent assembly steps, (f) so that, with high probability, correct assemblies will form. Thus, the sequence is designed to encode its own correct self-assembly in a signal superimposed on the coding sequence by using synonymous codon substitutions in a manner friendly to the expression vector.

[0019] R. M. Horton et al. “Engineering hybrid genes without the use of restriction enzymes: gene splicing by overlap extension” Gene, 1989, 77, 61-68, the disclosure of which is incorporated by reference in its entirety, discloses overlap extension, but does not disclose optimizing overlap regions to facilitate the assembly of pieces in the desired order. In particular, Horton et al. does not disclose the use of informatics to encode the self-assembly of the gene by exploiting sequence degeneracy to achieve high melting temperatures for correct hybridizations and low melting temperatures for incorrect hybridizations.

[0020] Accordingly, the present disclosure provides a method for synthesizing a DNA sequence comprising at least the steps of: (i) dividing the DNA sequence recursively into small pieces of DNA, wherein adjacent pieces comprise overlapping regions; (ii) optimizing the sequences of the pieces of DNA resulting from each recursive division to strengthen correct hybridizations and to disrupt incorrect hybridizations; (iii) obtaining the optimized small pieces of DNA, wherein the overlapping regions of any adjacent pieces of single-stranded DNA are complementary; (iv) combining the pieces of DNA derived from the division of the next larger piece of DNA; (v) allowing the pieces of DNA to self-assemble to form a DNA construct comprising single-stranded DNA segments connected by double-stranded overlap regions; (vi) producing the next-larger piece of DNA from the DNA construct; and (vii) repeating steps (iv), (v), and (vi) in reverse order of the recursive division in step (i) to produce the DNA sequence.

[0021] The present disclosure further provides a DNA sequence, synthesized according to a method comprising at least the steps of: (i) dividing the DNA sequence recursively into small pieces of DNA, wherein adjacent pieces comprise overlapping regions; (ii) optimizing the sequences of the pieces of DNA resulting from each recursive division to strengthen correct hybridizations and to disrupt incorrect hybridizations; (iii) obtaining the optimized small pieces of DNA, wherein the overlapping regions of any adjacent pieces of single-stranded DNA are complementary; (iv) combining the pieces of DNA derived from the division of the next larger piece of DNA; (v) allowing the pieces of DNA to self-assemble to form a DNA construct comprising single-stranded DNA segments connected by double-stranded overlap regions; (vi) producing the next-larger piece of DNA from the DNA construct; and (vii) repeating steps (iv), (v), and (vi) in reverse order of the recursive division in step (i) to produce the DNA sequence.

[0022] In some embodiments, a next-larger piece of DNA produced in step (vi) comprises a mixture of DNA molecules. Some embodiments further comprise a step of selecting a DNA molecule from the mixture likely to have the correct DNA sequence and using the selected DNA molecule in the synthesis of the DNA sequence. In some embodiments, a DNA molecule is separated from the mixture by cloning. In some embodiments, the selection comprises determining a property of selected DNA molecules from the mixture, or of polypeptides expressed therefrom, and selecting a DNA molecule based on a predetermined value for the property. In some embodiments, the selection comprises sequencing a sample of DNA molecules from the mixture and selecting a DNA molecule with the correct DNA sequence. In some embodiments, the selection comprises expressing a polypeptide from each member of a sample of DNA molecules from the mixture, determining the molecular weight of the polypeptide, and selecting a DNA molecule from which a polypeptide with a predetermined molecular weight is expressed. In some embodiments, a start codon and/or stop codon is incorporated into the DNA molecule from which a polypeptide is expressed. In some embodiments, the reading frame of the DNA molecule is adjusted with respect to the start codon and/or stop codon. In some embodiments, one or more stop codons are inserted into the expression vector downstream (3′) from the gene. In some embodiments, the molecular weight of the polypeptide is determined by electrophoresis.

[0023] The present disclosure further provides a method for synthesizing a DNA sequence comprising at least the steps of: (i) dividing the DNA sequence recursively into small pieces of DNA, wherein adjacent pieces comprise overlapping regions; (ii) obtaining the small pieces of DNA, wherein the overlapping regions of any adjacent pieces of single-stranded DNA are complementary; (iii) combining the pieces of DNA derived from the division of the next larger piece of DNA; (iv) allowing the pieces of DNA to self-assemble to form a DNA construct comprising single-stranded DNA segments connected by double-stranded overlap regions; (v) producing the next-larger piece of DNA from the DNA construct; (vi) selecting a next-larger piece of DNA likely to have the correct sequence; and (vii) repeating steps (iii), (iv), (v), and (vi) in reverse order of the recursive division in step (i) to produce the DNA sequence.

[0024] The present disclosure further provides a DNA sequence, synthesized according to a method comprising at least the steps of: (i) dividing the DNA sequence recursively into small pieces of DNA, wherein adjacent pieces comprise overlapping regions; (ii) obtaining the small pieces of DNA, wherein the overlapping regions of any adjacent pieces of single-stranded DNA are complementary; (iii) combining the pieces of DNA derived from the division of the next larger piece of DNA; (iv) allowing the pieces of DNA to self-assemble to form a DNA construct comprising single-stranded DNA segments connected by double-stranded overlap regions; (v) producing the next-larger piece of DNA from the DNA construct; (vi) selecting a next-larger piece of DNA likely to have the correct sequence; and (vii) repeating steps (iii), (iv), (v), and (vi) in reverse order of the recursive division in step (i) to produce the DNA sequence.

[0025] In some embodiments, the DNA sequence comprises a regulatory sequence. In other embodiments, the synthetic gene has an intergenic sequence. In other embodiments, the DNA sequence has flanking regulatory and intergenic sequences extending into other flanking genes. In other embodiments, the DNA sequence encodes a polypeptide. Preferably, the polypeptide is a portion of a full-length protein. More preferably, the polypeptide is a full-length protein. In some embodiments, the DNA sequence is a synthetic genome comprising multiple flanking encoded polypeptides, their regulatory regions, and intergenic regions.

[0026] In a preferred embodiment, the DNA sequence is divided into small pieces of DNA in a single division. This embodiment is referred to herein as “direct self-assembly.” In another preferred embodiment, the sequence of the synthetic gene is divided into small pieces of DNA in a plurality of divisions. This embodiment is referred to herein as “recursive assembly” or “hierarchical assembly.” In one embodiment, the sequence of the synthetic gene is divided into pieces of DNA of about 1,500 bases long or shorter.

[0027] The small pieces of DNA are preferably about 60 bases long or shorter, more preferably, about 50 bases long or shorter. Preferably, the overlapping regions comprise from about 6 to about 60 base-pairs, more preferably, from about 14 to about 33 base-pairs.

[0028] In a preferred embodiment, optimization is performed by calculating a melting temperature for the pieces of DNA. Preferably, the lowest correct hybridization melting temperature is higher than the highest incorrect hybridization melting temperature. Those skilled in the art will realize that the size of the melting temperature gap is related to the annealing conditions such that a narrower gap may require more stringent annealing conditions in the reassembly step to provide the requisite level of fidelity. Consequently, the temperature gap has no minimum value. Practically, the difference between the lowest-melting correct match and the highest-melting incorrect match is at least about 1° C., more preferably, at least about 4° C., more preferably, at least about 8° C., most preferably, at least about 16° C. The wider the temperature gap, the more robust the self-assembly, thereby permitting the use of less stringent annealing conditions. Those skilled in the art will appreciate that optimization may be performed using other parameters or measures related to hybridization propensity, for example, free energy, enthalpy, entropy, or other arithmetic or algebraic combinations of such parameters or measures, to achieve the same effect as melting temperature. Indeed, the melting temperature itself is one such arithmetic or algebraic combination of such parameters or measures. Consequently, in some embodiments, optimization is performed by calculating a parameter related to hybridization propensity for the pieces of DNA, for example, free energy, enthalpy, entropy, and arithmetic or algebraic combinations thereof.
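
The melting temperature gap described in paragraph [0028] is the lowest correct-match melting temperature minus the highest incorrect-match melting temperature. The sketch below computes that difference from two lists of temperatures. It is an illustration only: the function names are hypothetical, and the simple "2 degrees per A/T, 4 degrees per G/C" rule used for a perfectly matched overlap is a crude stand-in chosen for brevity, not the melting model of the disclosed method. Melting temperatures for incorrect (mismatched) hybridizations would have to come from a separate thermodynamic calculation and are supplied here as plain numbers.

```python
from typing import Iterable


def wallace_tm(overlap: str) -> float:
    """Rough melting temperature (deg C) of a perfectly matched duplex.

    Assumption: 2 deg per A/T pair and 4 deg per G/C pair, a rule of
    thumb used here only as a placeholder for a real melting model.
    """
    s = overlap.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("overlap must contain only A, C, G, T")
    return 2.0 * at + 4.0 * gc


def melting_gap(correct_tms: Iterable[float], incorrect_tms: Iterable[float]) -> float:
    """Lowest correct-match Tm minus highest incorrect-match Tm."""
    return min(correct_tms) - max(incorrect_tms)


def meets_gap(correct_tms, incorrect_tms, min_gap: float = 1.0) -> bool:
    """True when the gap reaches min_gap (hypothetical default of 1 deg C)."""
    return melting_gap(list(correct_tms), list(incorrect_tms)) >= min_gap


if __name__ == "__main__":
    # Hypothetical overlaps and hypothetical incorrect-match temperatures.
    correct = [wallace_tm(o) for o in ("GCGGCCATGGCACGTC", "CGCTGGATCCGGTACC")]
    incorrect = [38.0, 44.0]
    print(melting_gap(correct, incorrect), meets_gap(correct, incorrect, 8.0))
```

A positive gap means that an annealing temperature exists at which every correct overlap is favored over every incorrect one; a wider gap tolerates less stringent annealing conditions.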

[0029] In some embodiments, the pieces of DNA are optimized by permuting silent codon substitutions, for example for a portion encoding a polypeptide. In some embodiments, the pieces of DNA are optimized by taking advantage of the degeneracy in the regulatory region consensus sequence, for example for a regulatory region. In some embodiments, the pieces of DNA are optimized by adjusting boundary points between adjacent pieces of DNA. In some embodiments, the pieces of DNA are optimized by direct base assignment, for example for an intergenic region.
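
The silent codon substitutions mentioned in paragraph [0029] can be illustrated by enumerating every DNA sequence that encodes the same peptide. The following is a minimal sketch under stated assumptions: it uses the standard genetic code, ignores codon usage preferences of the expression host, and enumerates exhaustively, which is practical only for very short coding regions. The function names are hypothetical, and a search over such variants for a favorable melting temperature gap is not shown.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}

# Map each amino acid (or stop, '*') to its synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def silent_variants(dna: str):
    """Yield every DNA sequence encoding the same peptide as dna."""
    peptide = translate(dna)
    for combo in product(*(SYNONYMS[aa] for aa in peptide)):
        yield "".join(combo)


if __name__ == "__main__":
    for variant in silent_variants("ATGAAATTT"):  # Met-Lys-Phe
        print(variant, translate(variant))
```

Each variant leaves the encoded polypeptide unchanged while altering the bases available to an overlapping region, which is the degree of freedom that the optimization exploits.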

[0030] In a preferred embodiment, at least one of the optimized small pieces of DNA is synthetic. In another preferred embodiment, at least one of the optimized small pieces of DNA is single-stranded.

[0031] In some embodiments, a single-stranded DNA segment in the DNA construct has a length of zero bases. In some embodiments, a single-stranded DNA segment has a length of from about zero bases to about 20 bases.

[0032] In a preferred embodiment, the next-larger piece of DNA is produced by cloning the DNA construct and using cellular machinery. Examples of suitable cloning methods include exonuclease III cloning, topoisomerase cloning, restriction enzyme cloning, and homologous recombination cloning. In another preferred embodiment, the next-larger piece of DNA is produced by ligating the DNA construct. In yet another preferred embodiment, the next-larger piece of DNA is produced by extending the DNA construct by a reaction using a DNA polymerase. Preferably, the DNA polymerase is a proof-reading DNA polymerase.

[0033] In a preferred embodiment, a 3′ nucleotide in an overlapping region is G or C. Preferably, both 3′ nucleotides in an overlapping region are independently G or C. In another preferred embodiment, a 3′ nucleotide in an overlapping region is A or T.

[0034] In a preferred embodiment, a DNA polymerase primer is mixed with the pieces of DNA derived from the division of the next-larger piece of DNA. In another preferred embodiment, no DNA polymerase primer is combined with the pieces of DNA derived from the division of the next-larger piece of DNA.

[0035] In a preferred embodiment, a restriction site is designed into an overlapping region. In another preferred embodiment, the restriction site is digested with a site-specific restriction enzyme.

BRIEF DESCRIPTION OF THE DRAWINGS

[0036] FIG. 1A illustrates an embodiment of the disclosed method for synthesizing a DNA sequence by overlap extension in which a large piece of DNA is divided into five medium-sized pieces. FIG. 1B illustrates an embodiment for the synthesis of a DNA sequence by overlap extension in which a medium-sized piece of DNA is divided into 12 short segments.

[0037] FIG. 2 illustrates an embodiment using a direct self-assembled DNA construct, from which a full-length DNA sequence is produced by ligation.

[0038] FIG. 3 illustrates an embodiment of the disclosed method for synthesizing a synthetic gene or piece of DNA.

[0039] FIG. 4A-FIG. 4C illustrate the yields of full-length oligonucleotide of length 20 to 250 nt, for coupling efficiencies of 99.5%, 99%, and 98%, respectively.

[0040] FIG. 5A schematically illustrates an embodiment comprising division and reassembly of a gene from intermediate fragments. FIG. 5B schematically illustrates an embodiment comprising the division and reassembly of one of the intermediate fragments into oligonucleotides that include leader and trailer sequences. FIG. 5C schematically illustrates an embodiment of a leader used in the expression of a polypeptide from an intermediate fragment. FIG. 5D schematically illustrates an embodiment of a trailer used in the expression of a polypeptide from an intermediate fragment.

[0041] FIG. 6 illustrates the distribution of melting temperatures for the initial codon assignment for E. coli threonine deaminase in EXAMPLE 1. (solid) Correct matches between small segment overlaps. (dash-dot) Correct matches between long strand overlaps. (dashed) Incorrect matches between small segments. (dotted) Incorrect matches between long strands.

[0042] FIG. 7 illustrates the distribution of melting temperatures after the final codon assignment for E. coli threonine deaminase in EXAMPLE 1.

[0043] FIG. 8 is the final codon assignment for E. coli threonine deaminase in EXAMPLE 1. The codon assigned to each amino acid in the protein is shown as “Xn” where “X” is a one-letter amino acid code and “n” is a codon index.

[0044] FIG. 9 is a key that maps an amino acid code and codon index for E. coli.

[0045] FIG. 10 illustrates the codon usage for the synthetic E. coli threonine deaminase sequence from EXAMPLE 1 vs. the codon usage in the E. coli genome. Each codon is shown as a point in which the x-coordinate is its usage per 1,000 in E. coli and the y-coordinate is its usage per 1,000 in threonine deaminase. (x, solid line) Usage in native threonine deaminase. (o, dashed line) Usage in the synthetic gene.

[0046] FIG. 11A-FIG. 11E provide an overlap map of the synthesized DNA segments of EXAMPLE 1. Each row corresponds to a short segment of single-stranded DNA that was synthesized directly. The overlaps between short segments are indicated by brackets: [ ]. The overlaps between long strands are indicated by braces: { }.

[0047] FIG. 12A-FIG. 12E illustrate the synthesized oligonucleotide segments from EXAMPLE 1 with forward and reverse complements indicated. Each row corresponds to a short segment of single-stranded DNA that was synthesized directly. Overlaps between short segments are indicated by underlining and numbering. Each underlined region in a forward segment is the reverse complement of the underlined region having the same number, and vice versa. Reverse complement region numbers are primed.

[0048] FIG. 13 is a gel of the products of the first set of overlap extension reactions in EXAMPLE 1.

[0049] FIG. 14 is a gel of the products of the PCR reactions in EXAMPLE 1.

[0050] FIG. 15 is a gel of the products of the second overlap extension reaction in EXAMPLE 1.

[0051] FIG. 16 illustrates the distribution of melting temperatures for the initial codon assignment for variola DNA polymerase-1 in EXAMPLE 2. (solid) Correct matches between small segment overlaps. (dash-dot) Correct matches between long strand overlaps. (dashed) Incorrect matches between small segments. (dotted) Incorrect matches between long strands.

[0052] FIG. 17 illustrates the distribution of melting temperatures after the final codon assignment for variola DNA polymerase-1 in EXAMPLE 2.

[0053] FIG. 18 is the final codon assignment for variola DNA polymerase-1 in EXAMPLE 2. The codon assigned to each amino acid in the protein is shown as “Xn” where “X” is a one-letter amino acid code and “n” is a codon index.

[0054] FIG. 19 illustrates the codon usage for the synthetic variola DNA polymerase-1 sequence from EXAMPLE 2 vs. the codon usage in the E. coli genome. Each codon is shown as a point in which the x-coordinate is its usage per 1,000 in E. coli and the y-coordinate is its usage per 1,000 in threonine deaminase. (x, solid line) Usage in native threonine deaminase. (o, dashed line) Usage in the synthetic gene.

[0055] FIG. 20 illustrates the distribution of melting temperatures at the initial codon assignment for variola DNA polymerase-2 in EXAMPLE 2. (solid) Correct matches between small segment overlaps. (dash-dot) Correct matches between long strand overlaps. (dashed) Incorrect matches between small segments. (dotted) Incorrect matches between long strands.

[0056] FIG. 21 illustrates the distribution of melting temperatures after the final codon assignment for variola DNA polymerase-2 in EXAMPLE 2.

[0057] FIG. 22 is the final codon assignment for variola DNA polymerase-2 in EXAMPLE 2. The codon assigned to each amino acid in the protein is shown as “Xn” where “X” is a one-letter amino acid code and “n” is a codon index.

[0058] FIG. 23 illustrates the codon usage for the synthetic variola DNA polymerase-2 sequence from EXAMPLE 2 vs. the codon usage in the E. coli genome. Each codon is shown as a point in which the x-coordinate is its usage per 1,000 in E. coli and the y-coordinate is its usage per 1,000 in threonine deaminase. (x, solid line) Usage in native threonine deaminase. (o, dashed line) Usage in the synthetic gene.

[0059] FIG. 24A-FIG. 24G are overlap maps for the synthesized DNA segments for variola polymerase-1 of EXAMPLE 2, including DNA polymerase leaders (FIG. 24A) and trailers (FIG. 24G). Each row corresponds to a short segment of single-stranded DNA that was synthesized directly. The overlaps between short segments are indicated by brackets: [ ].

[0060] FIG. 25A-FIG. 25G are overlap maps for the synthesized DNA segments for variola polymerase-2 of EXAMPLE 2, including DNA polymerase leaders (FIG. 25A) and trailers (FIG. 25G). Each row corresponds to a short segment of single-stranded DNA that was synthesized directly. The overlaps between short segments are indicated by brackets: [ ].

[0061] FIG. 26A-FIG. 26E illustrate the synthesized segments for variola polymerase-1 from EXAMPLE 2 with forward and reverse complement indicated. Each row corresponds to a short segment of single-stranded DNA that was synthesized directly. Overlaps between short segments are indicated by underlining and numbering. Each underlined region in a forward segment is the reverse complement of the underlined region having the same number, and vice versa. Reverse complement region numbers are primed.

[0062] FIG. 27A-FIG. 27E illustrate the synthesized segments for variola polymerase-2 from EXAMPLE 2 with forward and reverse complement indicated. Each row corresponds to a short segment of single-stranded DNA that was synthesized directly. Overlaps between short segments are indicated by underlining and numbering. Each underlined region in a forward segment is the reverse complement of the underlined region having the same number, and vice versa. Reverse complement region numbers are primed.

[0063] FIG. 28 is a gel of fragments 0-4, which make up variola DNA polymerase-1 from EXAMPLE 2.

[0064] FIG. 29 is a gel of fragments 5-9, which make up variola DNA polymerase-2 from EXAMPLE 2.

[0065] FIG. 30 is a gel of Parts I and II of variola DNA polymerase from EXAMPLE 2.

[0066] FIG. 31 is a gel of the ligation product of Parts I and II of variola DNA polymerase from EXAMPLE 2.

[0067] FIG. 32 illustrates the distribution of melting temperatures for the initial codon assignment for E. coli threonine deaminase in EXAMPLE 3. (solid) Correct matches between small segment overlaps. (dashed) Incorrect matches between small segments.

[0068] FIG. 33 illustrates the distribution of melting temperatures after the final codon assignment for E. coli threonine deaminase in EXAMPLE 3.

[0069] FIG. 34 is the final codon assignment for E. coli threonine deaminase in EXAMPLE 3. The codon assigned to each amino acid in the protein is shown as “Xn” where “X” is a one-letter amino acid code and “n” is a codon index.

[0070] FIG. 35 illustrates the codon usage for the synthetic E. coli threonine deaminase sequence from EXAMPLE 3 vs. the codon usage in the E. coli genome. Each codon is shown as a point in which the x-coordinate is its usage per 1,000 in E. coli and the y-coordinate is its usage per 1,000 in threonine deaminase. (x, solid line) Usage in native threonine deaminase. (o, dashed line) Usage in the synthetic gene.

[0071] FIG. 36 provides an overlap map for the synthesized DNA segments of EXAMPLE 3. Each row corresponds to a short segment of single-stranded DNA that was synthesized directly.

[0072] FIG. 37A-FIG. 37F illustrate the synthesized segments from EXAMPLE 3 with forward and reverse complement indicated, including leader (FIG. 37A) and trailer (FIG. 37F) primers. Each row corresponds to a short segment of single-stranded DNA that was synthesized directly. Overlaps between short segments are indicated by underlining and numbering. Each underlined region in a forward segment is the reverse complement of the underlined region having the same number, and vice versa. Reverse complement region numbers are primed.

[0073] FIG. 38 is a gel of the four medium-sized pieces synthesized by direct self-assembly and ligation in EXAMPLE 3.

[0074] FIG. 39 is a gel of the threonine deaminase gene synthesized by ligation of the four medium-sized pieces in EXAMPLE 3.

[0075] FIG. 40A-FIG. 40E illustrate the sequences of the gene leader, three intermediate fragments, and gene trailer for the Ty3 GAG3 ORF synthesized in EXAMPLE 4.

[0076] FIG. 41 is an agarose gel of the three intermediate fragments of the Ty3 GAG3 synthesized in EXAMPLE 4.

[0077] FIG. 42 is an agarose gel of the Ty3 GAG3 gene synthesized in EXAMPLE 4.

[0078] FIG. 43 is an SDS-PAGE gel of the Ty3 Gag3p polyprotein produced in EXAMPLE 4.

[0079] FIG. 44 illustrates the sequence for the Ty3 IN gene synthesized in EXAMPLE 5.

[0080] FIG. 45A-FIG. 45L provide overlap maps for the leader, ten intermediate fragments, and trailer for the Ty3 IN gene synthesized in EXAMPLE 5.

[0081] FIG. 46A-FIG. 46L illustrate the 50 nt oligonucleotides used in the synthesis of the intermediate fragments for the Ty3 IN gene synthesized in EXAMPLE 5.

[0082] FIG. 47 is an agarose gel of the ten intermediate fragments of the Ty3 IN gene synthesized in EXAMPLE 5.

[0083] FIG. 48 is an agarose gel of a Ty3 IN gene synthesized in EXAMPLE 5.

[0084] FIG. 49 is an SDS-PAGE gel of a Ty3 IN protein synthesized in EXAMPLE 5.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0085] As used herein, the term “DNA” includes both single-stranded and double-stranded DNA. The term “piece” may refer to either a real or hypothetical piece of DNA depending on context. A “very large” piece of DNA is longer than about 1,500 bases, a “large” piece of DNA is about 1,500 bases or fewer, a “medium-sized” piece of DNA is about 300 to 350 bases or fewer, and a “short” piece of DNA is about 50 to 60 bases or fewer. Short pieces of DNA are also referred to as “oligonucleotides” by those skilled in the art. It will be appreciated that these numbers are approximate, however, and may vary with different processes or process variations. Although descriptions of preferred embodiments of the disclosed method that follow describe each recursive or hierarchical step as involving pieces of DNA of the same size range (for example, in which all of the pieces of DNA are very large, large, medium-sized, or short), one skilled in the art will appreciate that a hierarchical step may involve DNA from more than one size range. A particular step may involve both short and medium-sized pieces of DNA, or even short, medium-sized, and large pieces of DNA.

[0086] A “small” or “short” piece of DNA is a DNA segment that can be synthesized, purchased, or otherwise readily obtained. The term “segment” is also used herein to mean “small piece.” Those skilled in the art will understand that the term “synthon” is synonymous with the terms “small piece” and “segment” as used herein, although the term “synthon” is not used herein. The DNA segments used in the Examples that follow are synthetic; however, the disclosed method also comprehends using DNA segments derived from other sources known in the art, for example, from natural sources including viruses, bacteria, fungi, plants, or animals; from transformed cells; from tissue cultures; by cloning; or by PCR amplification of a naturally occurring or engineered sequence. As used herein, a “correct” piece of DNA is a piece of DNA with the correct or desired nucleotide sequence. An “incorrect” piece is one with an incorrect or undesired nucleotide sequence.

[0087] The disclosed method proceeds by a divide-and-conquer strategy (see Aho et al., The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, Mass., 1974, the disclosure of which is incorporated by reference in its entirety). A problem that is too large to be solved directly is broken recursively into smaller sub-problems until each is small enough to be solved directly, then the small sub-solutions are combined recursively into a solution to the original problem. Here, a particular full-length gene is too long to synthesize directly. It is broken recursively into smaller overlapping pieces until each is small enough to be synthesized. The gene is then reassembled in the reverse order of the disassembly. The smallest pieces are reassembled into the next larger pieces, which are reassembled into the next larger pieces, and so on. A reassembly step may be performed by overlap extension using a high-fidelity DNA polymerase as illustrated in FIG. 1A and FIG. 1B, or by ligation as illustrated in FIG. 2. Those skilled in the art will realize that ligation may also be used to reassemble a single strand of DNA using a variation of the DNA construct illustrated in FIG. 2 in which the pieces of DNA that comprise one strand abut, but all of the pieces of DNA comprising the complementary strand do not. Another method for reassembly is cloning into an expression vector and transformation of an appropriate host. Those skilled in the art will understand that many methods of cloning are compatible with the disclosed method, for example, exonuclease III cloning, topoisomerase cloning, restriction enzyme cloning, and homologous recombination cloning.
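The divide-and-conquer strategy of paragraph [0087] can be illustrated with a minimal Python sketch. It is not the disclosed design procedure: it models a single strand only, splits at the midpoint without any sequence optimization, and the names divide, leaves, reassemble, max_len, and overlap, together with their default values, are assumptions introduced for illustration.

```python
def divide(seq, max_len=50, overlap=20):
    """Recursively split seq into overlapping pieces of at most max_len bases.

    Returns a nested tuple (left, right) that mirrors the division order,
    with plain strings at the leaves.
    """
    if overlap < 1 or max_len < 2 * overlap:
        raise ValueError("need overlap >= 1 and max_len >= 2 * overlap")
    if len(seq) <= max_len:
        return seq  # small enough to be synthesized directly
    mid = len(seq) // 2
    left = seq[: mid + overlap]  # left piece runs `overlap` bases past the midpoint
    right = seq[mid:]            # right piece starts at the midpoint
    return (divide(left, max_len, overlap), divide(right, max_len, overlap))


def leaves(tree):
    """List the smallest pieces of a division tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def reassemble(tree, overlap=20):
    """Rebuild the sequence in the reverse order of the division."""
    if isinstance(tree, str):
        return tree
    left = reassemble(tree[0], overlap)
    right = reassemble(tree[1], overlap)
    if left[-overlap:] != right[:overlap]:
        raise ValueError("adjacent pieces do not share the expected overlap")
    return left + right[overlap:]
```

Calling reassemble(divide(gene)) on any string gene returns the original string, which mirrors the statement that the gene is reassembled in the reverse order of the disassembly.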

[0088] As discussed in greater detail below, a synthetic oligonucleotide is typically a mixture containing the desired oligonucleotide mixed with incorrect oligonucleotides, that is, oligonucleotides that do not have the desired sequence. As would be apparent to one skilled in the art, synthesizing a gene from such a mixture will likely produce the correct or desired gene in admixture with incorrect genes. One method for synthesizing only the correct gene is to reassemble the gene from pieces of DNA that are likely to have the correct sequences. Consequently, in some embodiments, during the reassembly process, pieces of DNA are selected that are likely to have the correct sequences for use in subsequent reassembly steps. In some embodiments, the criterion for the selection is a property of a reassembled piece of DNA or a polypeptide encoded by and expressed therefrom. In some embodiments, the property is determined from the full-length piece of DNA or polypeptide expressed therefrom. In some embodiments, the property is determined for the complementary strand of DNA or polypeptide expressed therefrom. In some embodiments, the property is determined for a piece of RNA transcribed from the piece of DNA.

[0089] Any property indicative of the correctness of the DNA sequence is useful in the method. The likelihood or probability of selecting a correct sequence depends on the property that is measured or determined. In some embodiments, the likelihood approaches certainty. An example of such a property is an experimentally determined nucleotide sequence of the piece of DNA. Any method known in the art for determining a nucleotide sequence may be used, including automated sequencing, manual sequencing, mass spectroscopic sequencing, and the like. In other embodiments, the property indicates that the piece of DNA probably has the correct sequence, but does not confirm the correctness of the sequence. An example of such a property is the molecular weight of a polypeptide encoded by and expressed from the piece of DNA. Any method known in the art for determining the molecular weight of a polypeptide or protein may be used, including gel electrophoresis, mass spectroscopy, and the like.

[0090] The term “PCR” as used herein in the context of assembling or reassembling DNA is a PCR or overlap extension reaction, preferably using a proof-reading DNA polymerase (“proof-reading PCR”). The term “direct self-assembly” as used herein in the context of assembling or reassembling DNA is a copy-free method of producing a DNA construct or a DNA construct produced by the method, comprising assembling a large piece of DNA from short synthetic segments in a single step. “Copy-free” means that the method lacks a copy step, such as is found in overlap extension or PCR, thus eliminating the copying errors. In a preferred embodiment, adjacent segments on the same strand abut, i.e., form a nick in the strand. Preferably, the nicks in the self-assembly are repaired by in vitro ligation. In another preferred embodiment, the nicks are repaired in vivo by cellular machinery after cloning.

[0091] The method comprises (a) hierarchical assembly (b) by high-fidelity techniques such as overlap extension using proof-reading DNA polymerase, ligation, cloning, or other methods, (c) with optimization of the sequences in the component oligonucleotides to facilitate preferential hybridization to the desired adjacent piece(s) and to disfavor undesired hybridizations to other pieces, for example, by exploiting the degeneracy of the genetic code or a regulatory region consensus sequence, (d) to achieve a DNA melting temperature gap between correct (high melting temperature) and incorrect (low melting temperature) hybridizations, (e) optionally selecting pieces of DNA likely to have the correct DNA sequence for use in the subsequent assembly steps, (f) so that, with high probability, correct assemblies will form.

[0092] Hierarchical assembly ensures that the complexity at any step, and therefore the possible number of incorrect assemblies, is bounded and manageably small. Overlap extension allows every correctly hybridized 3′ end to be extended reliably to a complementary copy of the prefix of its match. Using a high-fidelity DNA polymerase reaction ensures that the copy, with high probability, is correct. Ligation and cloning achieve the same result as overlap extension while avoiding the small but non-zero error rate associated with DNA polymerase reactions. The genetic code degeneracy permits flexibility in silent codon substitutions to strengthen correct matches and disrupt incorrect ones. A broad temperature gap between the highest-melting incorrect hybridization and the lowest-melting correct hybridization means that, with high probability, most incorrect ones have melted and most correct ones have annealed. Selecting gene fragments likely to be correct means that oligonucleotide sequence errors arising from chemical synthesis will be removed or reduced. Consequently, each reassembly reaction, with high probability, produces a correct larger piece of DNA. Errors will occur, of course, but with low probability. Consequently, two or more compensating errors that yield a product of the same molecular weight (i.e., the same band in the final gel) or two or more compensating deletions that provide a product of the same reading frame (i.e., the same or nearly the same encoded amino acid sequence) would correspond to a doubly rare or rarer event.

[0093] In optimizing the base sequences, theoretical melting temperatures are calculated for all possible correct and incorrect hybridizations by methods known in the art, for example, using Mfold. Such methods are disclosed, for example, in M. Zuker et al. “Algorithms and thermodynamics for RNA secondary structure prediction: A practical guide.” in RNA Biochemistry and Biotechnology, Barciszewski & Clark, eds.; Kluwer: 1999; D. H. Mathews et al. “Expanded sequence dependence of thermodynamic parameters provides robust prediction of RNA secondary structure” J. Mol. Biol., 1999, 288, 910-940; J. Santa-Lucia “A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics” Proc. Natl. Acad. Sci. USA, 1998, 95, 1460-1465; the disclosures of all of which are incorporated by reference in their entireties. The figure of merit is the gap between the lowest-melting correct match and the highest-melting incorrect match. Examples of incorrect matches include: (a) hairpins, in which a short segment folds back and hybridizes to itself; (b) dimers, in which a short segment is partially self-complementary; (c) intersegment mismatches, in which part of one short segment is partially complementary to part of a second; and (d) shifted correct matches, in which a misaligned overlap region is partially complementary to another region within the same overlap. Accordingly, in some embodiments, optimization comprises calculating a melting temperature for a single piece of DNA, for example, for a hairpin. In some embodiments, optimization comprises calculating a melting temperature for two pieces of DNA, for example, for an intersegment mismatch. In some embodiments, optimization comprises calculating both types of melting temperatures.
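The figure of merit of paragraph [0093] can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed calculation: the Wallace rule (2° C. per A/T pair, 4° C. per G/C pair) is substituted as a crude stand-in for the Mfold or nearest-neighbor calculations cited above, an incorrect match is approximated by the longest perfectly complementary run between two segments, and all function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def wallace_tm(duplex):
    """Rough melting temperature (deg C): 2 per A/T pair plus 4 per G/C pair."""
    at = duplex.count("A") + duplex.count("T")
    gc = duplex.count("G") + duplex.count("C")
    return 2.0 * at + 4.0 * gc


def longest_complementary_run(x, y):
    """Longest stretch of x that is perfectly complementary to a stretch of y.

    Computed as the longest common substring of x and the reverse
    complement of y, using the classic dynamic-programming table.
    """
    target = reverse_complement(y)
    best = ""
    prev = [0] * (len(target) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if x[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > len(best):
                    best = x[i - cur[j]:i]
        prev = cur
    return best


def melting_gap(correct_tms, incorrect_tms):
    """Gap between the lowest-melting correct and highest-melting incorrect match."""
    return min(correct_tms) - max(incorrect_tms)
```

Passing the same segment as both arguments of longest_complementary_run gives a rough screen for self-complementary stretches of the kind that produce hairpins and dimers. A positive value of melting_gap indicates that every correct match melts above every incorrect match under the chosen estimate.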

[0094] The melting temperature gap is widened by perturbations to the codon assignments, including strengthening correct matches by increasing G-C content in the overlaps and disrupting incorrect matches by choosing non-complementary bases. Codon assignments are varied and the process repeated until the gap is comfortably wide. This process may be performed manually or automated. In a preferred embodiment, the search of possible codon assignments is mapped into an anytime branch and bound algorithm developed for biological applications, which is described in R. H. Lathrop et al. “Multi-Queue Branch-and-Bound Algorithm for Anytime Optimal Search with Biological Applications” in Proc. Intl. Conf. on Genome Informatics, Tokyo, Dec. 17-19, 2001 pp. 73-82; in Genome Informatics 2001 (Genome Informatics Series No. 12), Universal Academy Press, the disclosure of which is incorporated by reference. Those skilled in the art will recognize that other optimization methods could be used, e.g., simulated annealing, genetic algorithms, other branch and bound techniques, hill-climbing, Monte Carlo methods, other search strategies, and the like. Those skilled in the art will further realize that optimizing, i.e., weakening, incorrect matches is functionally equivalent to optimizing, i.e., strengthening, correct matches, and vice versa. Consequently, suitable optimization methods include weakening incorrect matches, strengthening correct matches, and any combination thereof.

[0095] Those skilled in the art will realize that the size of the melting temperature gap is related to the annealing conditions such that a narrower gap may require more stringent annealing conditions in the reassembly step to provide the requisite level of fidelity. Consequently, the temperature gap has no minimum value. In some embodiments, the temperature gap is greater than 0° C., at least about 1° C., at least about 2° C., at least about 3° C., at least about 4° C., at least about 5° C., at least about 6° C., at least about 7° C., at least about 8° C., at least about 9° C., at least about 10° C., at least about 12° C., at least about 14° C., at least about 16° C., at least about 18° C., or at least about 20° C. Those skilled in the art will understand that, under appropriate annealing conditions, the temperature gap is arbitrarily close to 0° C. Practically, the difference between the lowest-melting correct match and the highest-melting incorrect match is at least about 1° C., more preferably, at least about 4° C., more preferably, at least about 8° C., most preferably, at least about 16° C. The wider the temperature gap, the more robust the self-assembly, thereby permitting the use of less stringent annealing conditions.

[0096] Those skilled in the art will realize that the temperature gap may be increased by optimizing the division process. For example, the division may be optimized through a nested search strategy, described below. Other appropriate search strategies will be apparent to those skilled in the art.

[0097] In an outer search, a temperature variable is initialized to a high temperature, for example, 80° C., then decremented by, for example, one degree at each outer search step. At each outer search step, an inner search is called to see whether it is possible to divide the gene into short pieces such that no overlap region has a melting temperature lower than the current setting of the temperature variable. When the inner search finally succeeds, its corresponding codon assignments are returned from the outer search.

[0098] The inner search proceeds via a depth-first search through the possible division points that meet the design constraints, for example, minimum overlap lengths, maximum segment lengths, possible C-G clamps at segment boundaries, and the like. At each inner search step, the overlap region resulting from the most recent candidate division point is used to generate the set of codon assignments that yield the highest self-melting temperature for that region. If the highest self-melting temperature is less than the current setting of the temperature variable, the inner search fails. Otherwise it continues to the next step.
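The nested search of paragraphs [0097] and [0098] can be sketched as follows. The sketch makes several simplifying assumptions that are not part of the disclosure: the base sequence is held fixed (no codon reassignment is attempted), melting temperatures are estimated by the Wallace rule rather than by a thermodynamic model, only two design constraints (min_overlap and max_seg) are enforced, and segments are represented as half-open (start, end) index pairs. All names and default values are hypothetical.

```python
from functools import lru_cache


def wallace_tm(duplex):
    """Rough melting temperature (deg C): 2 per A/T pair plus 4 per G/C pair."""
    at = duplex.count("A") + duplex.count("T")
    gc = duplex.count("G") + duplex.count("C")
    return 2.0 * at + 4.0 * gc


def inner_search(seq, t_min, min_overlap=6, max_seg=20):
    """Depth-first search for a division whose overlaps all melt at or above t_min.

    Returns a tuple of (start, end) segments, or None if the search fails.
    """
    n = len(seq)

    @lru_cache(maxsize=None)
    def dfs(start, prev_end):
        # The current segment begins at `start`; the previous one ended at `prev_end`.
        if n - start <= max_seg:
            return ((start, n),)  # the last segment reaches the end of the gene
        lo = max(prev_end, start + 1)  # next segment may not reach the previous one
        for end in range(start + max_seg, lo + min_overlap - 1, -1):
            for nxt in range(end - min_overlap, lo - 1, -1):
                if wallace_tm(seq[nxt:end]) < t_min:
                    continue  # this overlap region melts too low
                rest = dfs(nxt, end)
                if rest is not None:
                    return ((start, end),) + rest
        return None

    return dfs(0, 0)


def outer_search(seq, t_start=80, t_step=1, **constraints):
    """Lower the temperature setting until the inner search succeeds."""
    t = t_start
    while t > 0:
        division = inner_search(seq, t, **constraints)
        if division is not None:
            return t, division
        t -= t_step
    return None
```

Because the outer loop starts high and decrements, the first temperature at which the inner search succeeds is the highest setting, on the chosen grid, for which a valid division exists. The recursion depth grows with the number of segments, so this sketch is suited to short illustrative sequences.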

[0099] While the general DNA melting problem is non-linear because DNA secondary structure can cause non-local elements of the sequence to hybridize, correct matches at the overlap regions are characterized by linear hybridizations among purely local bases. Consequently, for a given subsequence of the gene, linear dynamic programming techniques based on base pairs (nearest neighbors) are used to determine the codon assignment to that subsequence that maximizes its melting temperature.
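A linear dynamic program of the kind described in paragraph [0099] can be sketched as follows. The sketch assumes that the quantity being maximized is a sum of scores over adjacent base pairs (dinucleotides), so that the best assignment up to a given codon depends only on the codon chosen immediately before it. The scoring function toy_pair_score is a placeholder, not a thermodynamic parameter set, and the codon table is deliberately partial; both, along with all names, are assumptions introduced for illustration.

```python
# Partial codon table taken from the standard genetic code (illustrative subset).
CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "E": ["GAA", "GAG"],
}


def toy_pair_score(dinucleotide):
    """Placeholder nearest-neighbor score: number of G/C bases in the pair."""
    return sum(base in "GC" for base in dinucleotide)


def best_codon_assignment(protein, codons=CODONS, pair_score=toy_pair_score):
    """Return (score, dna) maximizing the summed nearest-neighbor score.

    best[c] holds the best (score, dna) over all assignments that end in codon c.
    """
    def internal(codon):
        # Score of the two dinucleotides that lie inside a single codon.
        return pair_score(codon[0:2]) + pair_score(codon[1:3])

    best = {c: (internal(c), c) for c in codons[protein[0]]}
    for residue in protein[1:]:
        step = {}
        for c in codons[residue]:
            # Add the dinucleotide that spans the boundary with the previous codon.
            step[c] = max(
                (score + pair_score(prev[-1] + c[0]) + internal(c), dna + c)
                for prev, (score, dna) in best.items()
            )
        best = step
    return max(best.values())
```

The running time is linear in the number of residues and quadratic in the number of synonymous codons per residue. Replacing toy_pair_score with a table of measured nearest-neighbor parameters would leave the structure of the recursion unchanged.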

[0100] Thus, the gene is divided into short pieces by finding the division points that maximize the melting temperature of the lowest-melting overlap region, while respecting the design constraints, for example, minimum overlap lengths, maximum segment lengths, possible C-G clamps at segment boundaries, and the like. The codon assignments corresponding to the maximal division points are a better starting codon assignment at the beginning of the design process than are the most common codons.

[0101] Those skilled in the art will appreciate that optimization may be performed using other parameters or measures related to hybridization propensity, for example, free energy, enthalpy, entropy, or other arithmetic or algebraic combinations of such parameters or measures, to achieve the same effect as melting temperature. Melting temperature itself is one such arithmetic or algebraic combination of such parameters or measures.

[0102] The disclosed method comprises a design or decomposition process and a synthesis or reassembly process. In a preferred embodiment, the synthetic gene is designed according to a method illustrated as method 300 in FIG. 3. In step 302, the DNA sequence or gene is divided into small pieces of DNA or oligonucleotides. In step 304, the small pieces of DNA are optimized. In step 306, the optimized small pieces of DNA are obtained. In step 308, the pieces of DNA derived from one division of each piece of DNA are combined. In step 310, the pieces of DNA are allowed to self-assemble into a DNA construct. In step 312, the DNA construct is extended to full-duplex DNA. In step 314, a property indicative of the likelihood of the correctness of the resulting piece of DNA is determined, and pieces of DNA that are likely to have the correct sequence are selected. In step 316, steps 308-314 are repeated in reverse order of the division in step 302 to produce the synthetic gene.

[0103] Step 302. The synthetic gene is divided as follows. If the synthetic gene is very large, the DNA sequence is optionally divided into two or more large pieces of DNA of roughly equal size. Adjacent pieces of DNA preferably overlap by a number of nucleotides appropriate to facilitate reassembly. The extent of overlap depends on factors including the particular base sequence, method of reassembly, temperature, and salt concentration, and may be determined by the skilled artisan without undue experimentation. The adjacent pieces of DNA are designed for reassembly by any method or combination of methods known in the art for joining DNA molecules, for example, by ligation or by overlap extension. In some embodiments, the division is optimized to produce pieces of DNA that are more likely to assemble into the desired DNA sequence, as described in greater detail below.

[0104] In a preferred embodiment, each large piece is dsDNA and overlaps the adjacent large piece by at least the width of a restriction site (typically from about four to about six bases). Preferably, a restriction site that does not appear elsewhere in either piece is engineered into the DNA sequence of each resulting piece by exploiting the degeneracy of the genetic code as described in greater detail below. Adjacent large pieces are reassembled by cutting with the appropriate restriction enzyme, annealing the adjacent pieces together, and ligating the cut ends together.

[0105] In other embodiments, the large pieces of ssDNA or dsDNA are designed such that adjacent pieces of DNA on the same strand abut without a gap, as illustrated in FIG. 2. The DNA is reassembled by annealing and ligation. In a preferred embodiment, the DNA is single-stranded. In some embodiments, the large pieces of DNA are about 3000 bases long. In some embodiments, adjacent large pieces of DNA overlap by about 1500 bases.

[0106] In another preferred embodiment, each large piece is ssDNA or dsDNA and directly abuts the adjacent large piece. In this embodiment, primers are then constructed that overlap the abutting ends of the large pieces by from about 25 to about 30 bases or greater, and the abutting large pieces are reassembled by overlap extension.

[0107] In yet another preferred embodiment, each large piece is dsDNA with overlapping regions comprising from about 25 to about 30 bases or greater of complementary ssDNA. In this embodiment, the adjacent large pieces are reassembled by hybridization and ligation. In another embodiment, the ssDNA overlaps are designed to leave single-stranded regions flanking the double-stranded overlap region. In this embodiment, the adjacent large pieces are reassembled by overlap extension.

[0108] In still another embodiment, each resulting piece is ssDNA and the overlapping regions of adjacent resulting pieces comprise from about 25 to about 33 bases or greater of complementary single-stranded DNA. In some preferred embodiments, the overlapping regions comprise about 75 bases. In this embodiment, the adjacent resulting pieces are reassembled by overlap extension.

[0109] In certain embodiments, the large or very large pieces of DNA may be reassembled by cloning using a vector of any type known in the art. Examples of suitable vectors include, without limitation, plasmids, cosmids, phagemids, viruses, chromosomes, bacterial artificial chromosomes (BAC), or synthetic chromosomes.

[0110] Embodiments using ligation or cloning are preferred in situations in which using DNA polymerase is disfavored, for example, where the DNA is greater than about 3 kb to about 5 kb long.

[0111] The method for designing a piece of DNA is described for a large piece of DNA, but is applicable to pieces of DNA of any size. The piece of DNA is divided into overlapping short pieces of DNA that are readily available, that is, small pieces or segments. Preferably, each short piece is small enough to be synthesized readily. This large piece of DNA could be the target DNA sequence or a large piece of DNA derived from the division of a very large piece of DNA as described above.

[0112] In a preferred embodiment, the large piece of DNA is designed for “direct self-assembly.” In this embodiment, each large piece is divided into from about 50 to about 60 overlapping small pieces of from about 50 to about 60 bases or fewer. Preferably, the adjacent small pieces of DNA from the same strand abut, i.e., hybridize to form a DNA construct with no gaps between the pieces. In this embodiment, the large piece of DNA is preferably reassembled by ligation (“direct self-assembly and ligation”). In an embodiment in which adjacent small pieces from the same strand do not abut, i.e., hybridize to form a DNA construct with single-stranded gaps between the double-stranded overlaps, the large piece of DNA is preferably reassembled by overlap extension. In another embodiment, adjacent small pieces from the same strand abut, i.e., hybridize to form a DNA construct with no single-stranded gaps between the double-stranded overlaps, and the large piece of DNA is reassembled by overlap extension. In another embodiment, a DNA construct with a combination of gaps and no gaps is reassembled by overlap extension. In another preferred embodiment, the large piece of DNA is reassembled by cloning in an expression vector. In this embodiment, the ends of the DNA construct may have any combination of gaps and no gaps. Preferably, the ends of the large piece of DNA are adapted for insertion into an expression vector, for example, complementary to a restriction site in the expression vector.

[0113] In another preferred embodiment, the large piece of DNA is designed for “recursive assembly” or “hierarchical assembly.” In this embodiment, a large piece of DNA is divided first into overlapping medium-sized pieces of DNA. In some embodiments, the large piece of DNA is divided into about three to about ten medium-sized pieces of DNA, preferably, about five to about seven pieces. Each medium-sized piece is then subdivided into overlapping small pieces of DNA, preferably, from about six to about twelve pieces. As described above for direct self-assembly, the DNA pieces at each level of recursion may be designed for reassembly by any combination of methods, including ligation, overlap extension, or cloning. In a preferred embodiment, the DNA pieces are reassembled by overlap extension. One skilled in the art will realize that direct self-assembly is a special case of recursive assembly in which the large piece of DNA is divided in a single step.

[0114] Step 304. For the resulting pieces of DNA at each level of recursion, the sequences are optimized to strengthen correct matches (the overlap regions between adjacent pieces of DNA) and to disrupt incorrect matches (all other hybridizations). For example, a DNA sequence in a coding region may be optimized by taking advantage of the genetic code degeneracy. A DNA sequence in a regulatory region may be optimized by taking advantage of the degeneracy in the regulatory region consensus sequence. A DNA sequence outside a coding or regulatory region, i.e., in an intergenic region, may be optimized by direct base assignment.

[0115] Some embodiments use no or limited sequence optimization. For example, changes in a nucleotide sequence can change the secondary structure of DNA and RNA. Changes in the secondary structure in RNA viral genomes can affect the viability of the viruses. In some embodiments, no sequence optimization is performed. In other embodiments, selected sequences are optimized as described above and other sequences are not.

[0116] In some embodiments, the division described in step 302 is optimized to increase the probability that the pieces of DNA will reassemble into the desired DNA sequence. The boundary points between adjacent pieces of DNA are adjusted to create or to increase a temperature gap, or to disrupt other incorrect hybridizations, for example, hairpins.

[0117] Reassembly or synthesis of the synthetic gene is the formal reverse of the division process described in steps 302 and 304.

[0118] Step 306. Obtain the optimized small pieces of DNA. Typically, the small pieces of DNA are synthetic. In a preferred embodiment, the small pieces of DNA are single-stranded and overlapping portions of adjacent pieces are complementary.

[0119] Step 308. Combine the pieces of DNA derived from one division of each piece of DNA. In a recursive assembly process, the small pieces derived from each medium-sized piece are combined in this step. In the next recursive assembly cycle, the resulting medium-sized pieces derived from each large piece are combined in this step, and so on. In a direct self-assembly process, the small pieces derived from the large piece of DNA are combined in this step.

[0120] Step 310. Allow the DNA segments to self-assemble to form a DNA construct of ssDNA segments connected by double-stranded overlap regions. In embodiments in which pieces of DNA are double-stranded, the pieces are preferably first denatured. Embodiments using overlap extension to reassemble a piece of DNA have single-stranded gaps between the double-stranded overlap regions. Preferably, the single-stranded gaps are from about zero to about 20 bases long. Embodiments using ligation to reassemble a piece of DNA have single-stranded gaps of length zero (i.e., no gap, a nick in the DNA) and the double-stranded overlap regions abut each other. Embodiments using cloning to reassemble a piece of DNA have any combination of gaps and no gaps.

[0121] Step 312. Extend the DNA construct to full-duplex dsDNA. In embodiments with single-stranded gaps between the double-stranded overlaps, extension is accomplished using overlap extension, preferably, using a high-fidelity DNA polymerase reaction. In embodiments with no gaps between the double-stranded overlaps, extension is accomplished by ligation. In another preferred embodiment, the self-assembled construct is cloned into an expression vector, and the extension to full-duplex dsDNA is performed by the cellular machinery.

[0122] Some embodiments use ssDNA in subsequent steps. ssDNA is produced from the dsDNA using any method known in the art, for example, by denaturing or using nicking enzymes. In some embodiments, the DNA is cloned into a vector that produces ssDNA, for example, bacteriophage M13 or a plasmid containing the M13 origin of DNA replication. M13 is known to roll off ssDNA into the medium.

[0123] Step 314. In some embodiments, a property indicative of the likelihood of correctness of the resulting piece of DNA is optionally determined as disclosed in greater detail below. Pieces of DNA likely to have the correct sequence are selected for subsequent reassembly steps. In some embodiments, the property is determined after the synthetic gene is fully reassembled in order to select a synthetic gene likely to have the correct sequence. In some embodiments, the sequence of the selected synthetic gene is confirmed by sequencing.

[0124] Step 316. Repeat steps 308-314 in reverse order of the division in step 302 to produce the synthetic gene. Those skilled in the art will appreciate that the pieces of DNA identified in the division process may each be synthesized by a different reassembly method. For example, one piece may be synthesized by overlap extension, while a second is synthesized by ligation, and these two pieces reassembled by cloning into an expression vector.

[0125] In a preferred embodiment, the disclosed method takes advantage of the fact that the genetic code is sufficiently degenerate to allow codons to be assigned so that, with high probability, wrong hybridizations melt at lower temperatures and correct hybridizations melt at higher temperatures. Consequently, there is an intermediate temperature range within which, with high probability, the product that does form is mostly correct. Because errors occur with low probability, two or more compensating errors that yield a product with the correct molecular weight (i.e., the same band in the final gel) or two or more compensating deletions that yield a product of the same reading frame (i.e., the same or nearly the same encoded amino acid sequence) would correspond to a doubly rare or rarer event.

[0126] The final primers, or intermediate primers, or other flanking sequences may contain sequences that allow the synthesized gene to be inserted into an expression vector using various well-known cloning methods, for example, restriction enzymes, homologous recombination, exonuclease cloning, or other methods known in the art, allowing one to build a large synthetic gene quickly and easily.

[0127] A problem in some embodiments of the disclosed method is that a synthetic oligonucleotide, or small piece of DNA, typically contains a mixture of the desired DNA sequence (“full-length oligonucleotide”) contaminated with sequences with internal point deletions. This problem is referred to herein as the “N−1” problem because oligonucleotides with a single point deletion (“N−1 oligonucleotides”) are the most common contaminant in a typical chemical synthesis of oligonucleotides. Furthermore, in some embodiments, the N−1 oligonucleotides are the most problematic because they are more likely to hybridize, and consequently, to provide undesired products, than oligonucleotides with more than one point deletion or mutation. When this mixture of oligonucleotides is used to synthesize medium and large pieces of DNA as disclosed herein, the product pieces of DNA contain a population containing DNA with the desired sequence as well as DNA with errors arising from incorporation of the N−1 oligonucleotides. The N−1 oligonucleotide errors are cumulative and may cause frame-shift mutations, as understood by those skilled in the art.

[0128] In the chemically synthesized oligonucleotides, the typical coupling efficiency for each nucleotide is from about 98% to about 99.5%, or greater. TABLE I provides the yield of the desired full-length oligonucleotide of length 20 to 250 nt for coupling efficiencies of 99.5%, 99%, and 98%. These results are provided graphically in FIG. 4A-FIG. 4C, respectively. As expected, the probability of synthesizing a correct oligonucleotide decreases as the oligonucleotide length increases and as the coupling efficiency decreases. Because each of the oligonucleotide pieces used in the construction of the synthetic gene contains some N−1 contaminant, the probability of synthesizing the desired synthetic gene decreases with the length of the synthetic gene.

TABLE I

Oligonucleotide      Coupling Efficiency (yield, %)
Length (nt)          99.5%       99%         98%
 20                  90.916      82.617      68.123
 25                  88.665      78.568      61.578
 30                  86.471      74.717      55.662
 35                  84.311      71.055      50.314
 40                  82.243      67.573      45.480
 45                  80.208      64.261      41.11
 50                  78.222      61.112      37.16
 55                  76.286      58.117      33.59
 60                  74.398      55.268      30.363
 65                  72.557      52.56       27.445
 70                  70.761      49.984      24.808
 75                  69.009      47.534      22.425
 80                  67.301      45.204      20.27
 85                  65.635      42.989      18.323
 90                  64.011      40.882      16.562
 95                  62.427      38.878      14.971
100                  60.881      36.973      13.533
110                  57.905      33.438      11.057
115                  56.472      31.799       9.995
120                  55.074      30.240       9.034
130                  52.381      27.239       7.382
140                  49.821      24.734       6.031
150                  47.385      22.369       4.928
160                  45.068      20.23        4.027
170                  42.865      18.296       3.29
180                  40.769      16.546       2.688
190                  38.776      14.964       2.196
200                  36.88       13.533       1.795
210                  35.08       12.24        1.47
220                  33.36       11.07        1.19
230                  31.73       10.01        0.98
240                  30.18        9.05        0.8
250                  28.7         8.19        0.65
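The entries of TABLE I appear consistent with a simple model in which an oligonucleotide of N nucleotides requires N−1 coupling steps, each succeeding independently with the stated coupling efficiency. The following Python sketch computes yields under that assumption. The formula is inferred from the tabulated values rather than stated in the text, and the function names are hypothetical.

```python
def full_length_yield(length_nt, coupling_efficiency):
    """Percent of product that is full length, assuming length_nt - 1
    independent coupling steps at the given per-step efficiency (0 to 1)."""
    return 100.0 * coupling_efficiency ** (length_nt - 1)


def yield_table(lengths, efficiencies=(0.995, 0.99, 0.98)):
    """Map each length to its yields (percent, 3 decimals), one per efficiency."""
    return {
        n: [round(full_length_yield(n, e), 3) for e in efficiencies]
        for n in lengths
    }
```

For example, full_length_yield(20, 0.995) agrees with the first entry of TABLE I to within rounding.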

[0129] Even in cases in which the desired synthetic gene is synthesized with high probability of correct oligonucleotide order, the desired gene is invariably mixed with many defective genes arising from N−1 oligonucleotides. In many applications, this mixture of correct and defective genes is undesirable. Accordingly, disclosed below is a method for improving the probability of synthesizing the desired gene and/or selecting the desired gene from this mixture.

[0130] In some embodiments, the N−1 problem is addressed by assembling the chemically synthesized oligonucleotides using direct self-assembly and ligation, as described above and illustrated in FIG. 2. In embodiments using direct self-assembly and ligation, all of the nucleotides in each oligonucleotide are hybridized, thereby reducing the probability that an N−1 oligonucleotide will be incorporated in the preligation DNA construct. In embodiments using overlap extension, a preextension DNA construct incorporating an oligonucleotide with a deletion in a single-stranded region is about as likely as a DNA construct incorporating a correct oligonucleotide. The single-base deletion error rate in double-stranded regions is about 0.3%, while the error rate in single-stranded regions is about 0.5%.

[0131] In some embodiments, the N−1 problem is addressed by sampling the population of synthetic DNA molecules and sequencing the sampled molecules. In some embodiments, a random sample from the population of different DNA molecules produced in any of the reassembly steps, including the final step, is sequenced and only those molecules with the correct nucleotide sequence are used in the next reassembly step. The optimum sample size is related to the probability of synthesizing the desired DNA molecule. For example, a synthesis of a 200-nt oligonucleotide or intermediate fragment with a 99.5% coupling efficiency provides about 37% of the correct oligonucleotide. Randomly selecting four oligonucleotides or intermediate fragments from the product mixture provides about an 84% chance of selecting at least one correct oligonucleotide. For a 300-nt oligonucleotide or intermediate fragment at 99.5% coupling efficiency, the correct oligonucleotide makes up about 22% of the product. The probability of selecting at least one correct oligonucleotide or intermediate fragment from a sample of four oligonucleotides from this mixture is about 63%. The probabilities (in percent) of selecting at least one correct oligonucleotide or intermediate fragment using sample sizes of 1, 4, 6, and 8 for syntheses with coupling efficiencies of 99.5% and 99.7% and oligonucleotide lengths of 200 nt, 250 nt, and 300 nt are provided in TABLE II. As shown in TABLE II, only a modest amount of sequencing is necessary to provide a good probability of selecting a correct oligonucleotide or intermediate fragment.

TABLE II

Oligonucleotide   Coupling              Sample Size
Length (nt)       Efficiency     1       4       6       8
200               99.5%         36.7    83.9    93.6    97.4
                  99.7%         53.7    95.4    99.0    99.7
250               99.5%         28.6    74.0    86.7    93.2
                  99.7%         46.0    91.5    97.5    99.3
300               99.5%         22.2    63.4    77.9    86.6
                  99.7%         39.4    86.5    95.0    98.2
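The sample-size reasoning above can be written out as a short calculation. The following Python sketch is illustrative only and assumes that sampled molecules are independent draws from a large population in which a fixed fraction is correct; the function names `prob_at_least_one_correct` and `min_sample_size` are hypothetical. Taking the single-sample column of TABLE II as the correct fraction, the first function reproduces the remaining columns to within rounding.

```python
import math


def prob_at_least_one_correct(fraction_correct: float, sample_size: int) -> float:
    """Probability that a random sample contains at least one correct molecule.

    Assumes independent draws: P = 1 - (1 - f) ** k.
    """
    if not 0.0 <= fraction_correct <= 1.0:
        raise ValueError("fraction_correct must be in [0, 1]")
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    return 1.0 - (1.0 - fraction_correct) ** sample_size


def min_sample_size(fraction_correct: float, target_probability: float) -> int:
    """Smallest sample size whose success probability reaches the target."""
    if not 0.0 < fraction_correct < 1.0:
        raise ValueError("fraction_correct must be strictly between 0 and 1")
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - fraction_correct))


# 200-nt product at 99.5% coupling efficiency: about 36.7% correct (TABLE II).
for k in (1, 4, 6, 8):
    print(k, round(100.0 * prob_at_least_one_correct(0.367, k), 1))
```

The second function inverts the relationship, which may be convenient when a target confidence is fixed in advance and the number of clones to sequence is the unknown.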

[0132] In some embodiments, sampling is performed by cloning the DNA-to-be-sequenced into a suitable vector. Typically, each transformed colony corresponds to one molecule of the synthetic DNA. In some embodiments, a sample of transformed colonies is selected, the DNA is sequenced, and DNA with the correct sequence is used in the next hierarchical stage of assembly. The cloning is any type of cloning known in the art. In one embodiment, the cloning is topoisomerase I (TOPO®, Invitrogen) cloning.

[0133] The sampling is performed at any of the hierarchical reassembly stages. For example, in some embodiments, an oligonucleotide is sequenced after chemical synthesis. In some embodiments, oligonucleotides or intermediate fragments are assembled into a medium-sized piece of DNA, which is then sequenced. In some embodiments, medium-sized pieces of DNA are assembled into a large piece of DNA, which is then sequenced. In some embodiments, the sampling and sequencing are performed on the medium- or large-sized pieces of DNA, which are synthesized by direct self-assembly and ligation.

[0134] In some embodiments, the N−1 problem is addressed by analyzing the polypeptide(s) expressed from a sample from the population of synthetic DNA sequences. The DNA is expressed using any means known in the art, for example, inserting the gene in an expression vector or using a cell-free expression system. In some embodiments, the DNA sequence is cloned in an expression vector and expressed. As discussed above, each clone typically corresponds to one DNA molecule from the population. In some embodiments, the DNA is the full-length synthetic gene. In other embodiments, the DNA is an intermediate fragment. In the case of an intermediate fragment, those skilled in the art will realize that, in some embodiments, the intermediate fragment is designed with (1) a leader that provides a start codon in the correct reading frame, that is, provides an ATG in the DNA and a 0-2 nt filler that adjusts the reading frame in order to express the desired polypeptide, and (2) a trailer that provides one or more stop codons (TAA, TAG, or TGA) in the DNA and a 0-2 nt filler that adjusts the reading frame in order to terminate the desired polypeptide. Typically, the reading frame is the same as for the full-length synthetic gene, although other reading frames are used in some embodiments. Typically, from zero to two bases are inserted into the leader and trailer for adjusting the reading frame. Those skilled in the art will recognize that more than two bases could be used to adjust the reading frame. For example, in some embodiments, the leader and/or trailer encodes additional amino acids, restriction sites, or control sequences. Those skilled in the art will further realize that, in some embodiments, different leaders and/or trailers are used in conjunction with the same piece of DNA in different steps of the method. For example, in some embodiments, the leader and/or trailer used in the expression of a polypeptide from a piece of DNA is different from the leader and/or trailer used in the assembly of that piece of DNA. Some embodiments provide one or more stop codons downstream (3′) of the gene in order to stop the translation of DNA fragments constructed from one or more N−1 oligonucleotides. In some embodiments, the stop codons are engineered into the expression vector. Some embodiments include at least three stop codons downstream (3′) of the gene, at least one in each of the three possible reading frames. Some embodiments use groups of stop codons instead of single stop codons in each reading frame.

[0135] FIG. 5A-FIG. 5D illustrate an embodiment of the disclosed method in which a polypeptide is expressed from an intermediate fragment in the construction of a synthetic gene. FIG. 5A illustrates schematically the division and construction of a gene into a plurality of intermediate fragments.

[0136] FIG. 5B illustrates the division and construction of one of the intermediate fragments. The letters a-g each represent a portion of the sequence of the intermediate fragment. The brackets group these portions into oligonucleotides that are purchased or synthesized. “ldr” and “tlr” represent a leader and trailer, respectively. The corresponding portions of the sequence on the complementary strand are prefixed with a hyphen (-), i.e., “-ldr,” “-a,” . . . “-g,” and “-tlr.” Again, brackets are used to indicate the oligonucleotides.

[0137] FIG. 5C is a schematic of the leader (ldr) portion illustrated in FIG. 5B. From the 5′-end, the leader comprises a 10-nt filler, a CATATG restriction site, and a 0-2-nt filler at the 3′-end. In the illustrated embodiment, the length of the 5′-filler is determined by the requirements of the restriction enzyme. The restriction site is used in cloning the intermediate fragment, and includes an ATG start codon. The 0-2-nt filler adjusts the reading frame of the intermediate fragment relative to the start codon. In some embodiments, the restriction site does not include a start codon. In some embodiments, a start codon is incorporated in the 3′-filler.

[0138] FIG. 5D is a schematic of the trailer (tlr) portion illustrated in FIG. 5B. From the 5′-end, the trailer comprises a 0-2-nt filler, a TAATAA stop sequence, a GGATCC restriction site, and a 5-nt filler. The 0-2-nt filler adjusts the reading frame of the stop codon relative to the intermediate fragment. TAATAA is a pair of stop codons. Any suitable stop codon is useful. Some embodiments use one stop codon. GGATCC is a restriction site used for cloning the intermediate fragment. The length of the 3′-filler is determined by the requirements of the restriction enzyme. Those skilled in the art will understand that, in other embodiments, the leader and/or trailer use a different combination of fillers, restriction sites, start codon, and/or stop codons. In some embodiments, the intermediate fragment comprises a start and/or stop codon and the leader and/or trailer does not include the codon. For example, in the case of a synthetic gene, the gene typically includes both a start and stop codon. Similarly, those skilled in the art will understand that in some embodiments, the leader and/or trailer does not comprise a restriction site. Some embodiments of the leader and/or trailer do not use a 5′- and/or a 3′-filler.
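The leader and trailer layouts of FIG. 5C and FIG. 5D can be sketched as a small string-assembly routine. The following Python fragment is an illustration under stated assumptions, not the disclosed design procedure: the function names, the `frame_offset` parameter (the index, 0 to 2, of the first base of the first complete codon within the fragment), the example fragment, and the use of runs of C as filler bases are all hypothetical. C is chosen for the fillers only because no stop codon (TAA, TAG, TGA) contains a C, so a codon completed by the filler cannot itself be a stop codon.

```python
def frame_filler(bases_out_of_frame: int, base: str = "C") -> str:
    """Return the 0-2 nt filler needed to complete a partial codon."""
    return base * ((-bases_out_of_frame) % 3)


def add_leader_and_trailer(fragment: str, frame_offset: int,
                           five_prime_filler: str = "C" * 10,
                           three_prime_filler: str = "C" * 5) -> str:
    """Wrap an intermediate fragment in a leader and trailer.

    Leader:  10-nt filler + CATATG (contains the ATG start) + 0-2 nt filler.
    Trailer: 0-2 nt filler + TAATAA + GGATCC + 5-nt filler.
    The terminal filler sequences are placeholders (assumption).
    """
    if frame_offset not in (0, 1, 2):
        raise ValueError("frame_offset must be 0, 1, or 2")
    leader = five_prime_filler + "CATATG" + frame_filler(frame_offset)
    # Bases left over after the last complete codon of the fragment.
    leftover = (len(fragment) - frame_offset) % 3
    trailer = frame_filler(leftover) + "TAATAA" + "GGATCC" + three_prime_filler
    return leader + fragment + trailer


# Hypothetical fragment whose reading frame starts at its third base.
construct = add_leader_and_trailer("GCGCTGAAACC", frame_offset=2)
print(construct)
```

In this sketch the filler lengths are computed so that the start codon, the fragment's reading frame, and the stop sequence are all in the same frame; a real design would also check the fillers against the restriction enzymes' flanking-sequence requirements.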

[0139] A polypeptide expressed from a clone with an N−1 defect will be defective. The expressed peptide is analyzed using any means known in the art, for example, gel electrophoresis, capillary electrophoresis, two-dimensional electrophoresis, isoelectric focusing, spectroscopy, mass spectroscopy, NMR spectroscopy, chemical analysis, ligand binding, enzymatic cleavage, or a functional or immunological assay. A clone that expresses the correct peptide is free from N−1 defects.

[0140] In some embodiments, the expressed polypeptide is analyzed using gel electrophoresis, which separates polypeptides by molecular weight. Of the 64 possible DNA codons, 3 are stop codons. Consequently, the frame-shift caused by a point deletion is likely to generate a new stop codon, resulting in a prematurely truncated polypeptide, the molecular weight of which is determined using gel electrophoresis. A clone that provides a full-length polypeptide is likely to have the desired sequence, while one that provides a truncated polypeptide is likely to have at least one point deletion. In some embodiments, a clone with an N−1 defect or defects produces a polypeptide that is too long, because the N−1 defect results in a frame-shift that causes the terminating stop codon(s) to be ignored (read through). In some embodiments, such a polypeptide that is too long will be terminated by a stop codon engineered into the expression vector downstream (3′) of the gene. As discussed above, some embodiments comprise three groups of stop codons, one group in each possible reading frame. In these embodiments, the molecular weight of the expressed polypeptide is higher than expected.
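The effect of a point deletion on the reading frame can be illustrated with a short simulation. The following Python sketch is not part of the disclosed method: the function names and the 15-nt example sequence are hypothetical, and the read-through estimate assumes, as a rough simplification, that the shifted frame behaves like a run of independent, uniformly random codons, which real coding sequences do not strictly satisfy.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon_index(dna: str) -> Optional[int]:
    """Index (in codons) of the first in-frame stop codon, or None if absent."""
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def delete_base(dna: str, position: int) -> str:
    """Return the sequence with the base at the given 0-based position removed."""
    return dna[:position] + dna[position + 1:]


def prob_no_stop(n_codons: int) -> float:
    """Chance that n random codons contain no stop codon (61 of 64 are sense)."""
    return (61.0 / 64.0) ** n_codons


# Hypothetical open reading frame: ATG GCA CTG ACC TAA.
gene = "ATGGCACTGACCTAA"
mutant = delete_base(gene, 3)  # N-1 defect: the G at position 3 is lost
print(first_stop_codon_index(gene), first_stop_codon_index(mutant))
```

In the example, the deletion shifts a TGA into frame, so the mutant terminates two codons earlier than the intact sequence. Under the random-codon simplification, the chance of reading through 50 shifted codons without meeting a stop codon is roughly 9%, which is consistent with the statement that truncation is the likely outcome of a point deletion.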

[0141] In some embodiments, analysis of the expressed polypeptide is used to narrow the sample of clones that are then sequenced. In these embodiments, the analysis of the expressed polypeptide is used to identify and to eliminate clearly defective (e.g., truncated or too long) DNA clones. The remaining clones are then sequenced. In these embodiments, the expression and analysis is a semi- or nonrandom selection method, in contrast to the random selection method described above. In some embodiments, the expressed polypeptide is analyzed by gel electrophoresis. In some cases, gel electrophoresis does not distinguish a defective polypeptide from the correct polypeptide. For example, in some cases a DNA sequence with an N−1 defect generates a defective polypeptide that, to within the resolution of the electrophoresis conditions, has the same molecular weight as the correct polypeptide. This scenario can arise where the defective DNA sequence fortuitously expresses a defective polypeptide similar in molecular weight to the correct polypeptide, for example, where the point defect is near the end of the clone. In another scenario, the clone has 3N point deletions that do not generate a new stop codon. As discussed above, the defective polypeptide is most likely shorter than the correct polypeptide. A defective polypeptide closer in molecular weight to the correct polypeptide than the resolution of the electrophoresis experiment is not distinguished. Given the resolving ability of gel electrophoresis, selecting a correct clone using the method is highly probable. The probability is further improved using an analytical technique with higher resolution, for example, capillary electrophoresis or mass spectroscopy. In some cases, all of the clones selected for sequencing in the gene expression screen have the correct sequence, indicating the reliability of this selection method. Furthermore, expressing a gene and determining the molecular weight of the expressed polypeptide is typically faster and/or less expensive than the equivalent amount of DNA sequencing. In some embodiments, intermediate fragments are selected by estimating the molecular weight of the expressed polypeptide only, and DNA sequencing is reserved only for the final gene construct, and even then only after the molecular weight of a polypeptide expressed from the final gene has been estimated to be correct.

[0142] In some embodiments, all of the expressed polypeptides that are analyzed are defective, for example, truncated. In these embodiments, an analysis of the defective polypeptides indicates the location of the defect in the DNA sequence. The gene is then resynthesized using this information. In embodiments using multiple hierarchical synthesis steps, only some of the pieces of DNA are resynthesized, for example, an intermediate fragment containing the defect. In some embodiments, the offending fragment is divided in a different way and/or reoptimized, as discussed above. In some embodiments, a different clone is chosen to replace the offending fragment.

[0143] The method described herein provides a quick, easy, and inexpensive method for constructing a synthetic DNA gene that encodes any desired protein or any other desired nucleic acid. A first example provided herein describes a two-step recursive assembly of a gene encoding E. coli threonine deaminase, a protein with 514 amino acid residues (1,542 coding bases). A second example describes a three-step recursive assembly of a gene encoding the smallpox (variola) DNA polymerase, a protein with 1,005 amino acid residues (3,015 coding bases). A third example describes a direct self-assembly of a synthetic E. coli threonine deaminase gene, with reassembly by cloning into an expression vector. A fourth example describes a two-step recursive assembly with sampling and sequencing of the 876 bp Ty3 GAG3 gene, which encodes the Gag3p polyprotein. Ty3 is a retrotransposon in Saccharomyces cerevisiae. A fifth example describes a two-step recursive assembly with sampling and sequencing for the 1640 bp Ty3 integrase (“Ty3 IN”) gene. Accordingly, it will be appreciated that the method described herein may be used to construct any desired nucleic acid sequence or any desired gene.

[0144] E. coli threonine deaminase was chosen arbitrarily because (1) its size is comparable to most proteins, demonstrating wide applicability, (2) it assembles into a homo-tetramer, demonstrating protein-protein interactions, (3) its allosteric properties easily can be assessed for correct folding and assembly, and (4) in part by whim for old times' sake because its structure and properties were the Ph.D. thesis project of one of us. See, Hatfield & Ray “Coupling of slow processes to steady state reactions” J. Biol. Chem., 1970, 245(7), 1753-4; Hatfield, Ray, & Umbarger “Threonine deaminase from Bacillus subtilis. III. Pre-steady state kinetic properties.” J. Biol. Chem., 1970, 245(7), 1748-53; Hatfield & Umbarger “Threonine deaminase from Bacillus subtilis. II. The steady state kinetic properties.” J. Biol. Chem., 1970, 245(7), 1742-7; Hatfield & Umbarger “Threonine deaminase from Bacillus subtilis. I. Purification of the enzyme.” J. Biol. Chem., 1970, 245(7), 1736-41.

[0145] Smallpox (variola) DNA polymerase was chosen because (1) it is larger than most proteins, demonstrating wide applicability, and (2) current events render it of special interest. In particular, the ability to synthesize de novo a gene from a pathogenic organism permits study without actual use of the pathogen. Examples of such studies include regulation of gene expression, drug development, vaccine development, and the like. Because the pathogen is never used, there is no chance of exposure, either accidental or intentional. Moreover, the sequence of the synthetic gene may be modified to optimize expression in the selected, non-pathogenic organism.

[0146] Ty3 was chosen because of its resemblance to retroviruses. GAG3 encodes Gag3p, a 38 kDa polyprotein that is processed into a 26 kDa capsid and 9 kDa nucleocapsid that assemble into virus-like particles (VLP). Ty3 IN is implicated in the retrovirus-like integration of Ty3 in the S. cerevisiae genome.

[0147] Preferred embodiments of the disclosed method are illustrated in the following Examples. In these Examples, the terms “Medium-sized piece” and “Intermediate Fragment” are used interchangeably.

EXAMPLE 1 E. coli Threonine Deaminase by Two-step Recursive Decomposition and Overlap Extension Assembly

[0148] EXAMPLE 1 illustrates the synthesis of an E. coli threonine deaminase gene by a two-step hierarchical decomposition and reassembly by overlap extension. E. coli threonine deaminase is a protein with 514 amino acid residues (1,542 coding bases).

[0149] Design

[0150] The sequence design method permuted synonymous (silent) codon assignments to each amino acid in the desired protein sequence. Each synonymous codon change results in a different artificial gene sequence that encodes the same protein. Because E. coli was the desired expression host, the initial codon assignment was to pair each amino acid with its most frequent codon according to E. coli genomic codon usage statistics. Subsequently, the codon assignments were perturbed as described below. The final codon assignment implied a final DNA sequence to be achieved biochemically.
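The initial codon assignment described above amounts to a table lookup. The following Python sketch is illustrative only: the function names are hypothetical, and the `TOY_USAGE` table contains placeholder frequencies for three amino acids chosen for demonstration, not actual E. coli genomic codon usage statistics. A real design would substitute a complete usage table and then perturb the assignment as described in the text.

```python
from typing import Dict

# Placeholder codon frequencies (assumption, for illustration only).
TOY_USAGE: Dict[str, Dict[str, float]] = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.70, "AAG": 0.30},
    "L": {"CTG": 0.50, "TTA": 0.15, "TTG": 0.15, "CTT": 0.10, "CTC": 0.05, "CTA": 0.05},
}


def most_frequent_codons(usage: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each amino acid to its highest-frequency codon."""
    return {aa: max(codons, key=codons.get) for aa, codons in usage.items()}


def initial_codon_assignment(protein: str, usage: Dict[str, Dict[str, float]]) -> str:
    """Encode a protein using the most frequent codon for every residue."""
    best = most_frequent_codons(usage)
    return "".join(best[aa] for aa in protein)


print(initial_codon_assignment("MKL", TOY_USAGE))
```

Every synonymous substitution away from this starting sequence yields another DNA sequence encoding the same protein, which is the search space in which the subsequent perturbation operates.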

[0151] In this two-step hierarchical decomposition, the gene was divided first into five overlapping medium-sized pieces (in the present example, not longer than 340 bases, overlap not shorter than 33 bases), then each medium-sized piece was divided into several overlapping short segments (in the present example, not longer than 50 bases, overlap not shorter than 18 bases). All overlaps were lengthened if necessary to include a terminal C or G for priming efficiency.
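The length and overlap limits above can be illustrated with a naive tiling routine. The following Python sketch is an assumption-laden simplification, not the decomposition actually used: it places pieces greedily at the maximum length with exactly the minimum overlap, whereas the disclosed method selects boundaries together with the melting-temperature optimization and lengthens overlaps to end in C or G, neither of which is reproduced here. The function name `split_with_overlap` is hypothetical.

```python
from typing import List, Tuple


def split_with_overlap(total_length: int, max_piece: int, min_overlap: int) -> List[Tuple[int, int]]:
    """Greedy tiling of [0, total_length) into half-open (start, end) intervals.

    Each interval is at most max_piece long and shares exactly min_overlap
    bases with its successor (simplifying assumption).
    """
    if not 0 <= min_overlap < max_piece:
        raise ValueError("min_overlap must be non-negative and smaller than max_piece")
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    pieces = []
    start = 0
    while True:
        end = min(start + max_piece, total_length)
        pieces.append((start, end))
        if end == total_length:
            return pieces
        start = end - min_overlap


# Two-step decomposition with the limits of the present example.
medium = split_with_overlap(1542, 340, 33)
short = split_with_overlap(340, 50, 18)
print(len(medium), len(short))
```

With these limits the greedy tiling yields five medium-sized intervals for a 1,542-base gene and eleven short intervals for a 340-base piece, which is in line with the piece counts reported in this Example.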

[0152] Theoretical melting temperatures were calculated with Mfold for all possible correct and incorrect hybridizations of the medium-sized and short pieces of DNA using the most common codons. The results are illustrated in FIG. 6. The gap between the lowest-melting correct match and the highest-melting incorrect match was increased by perturbing the codon assignments as described above. Theoretical melting temperatures for the optimized sequences are illustrated in FIG. 7. In the present example, the gap was at least 10° C.
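The design criterion above reduces to comparing two extremes. The following Python sketch shows only that comparison; it does not compute melting temperatures, which in this Example were obtained with Mfold. The function names and all temperature values are hypothetical placeholders, not data from FIG. 6 or FIG. 7.

```python
from typing import Sequence


def tm_gap(correct_tms: Sequence[float], incorrect_tms: Sequence[float]) -> float:
    """Gap (deg C) between the lowest correct and highest incorrect melting temperature."""
    if not correct_tms or not incorrect_tms:
        raise ValueError("both temperature lists must be non-empty")
    return min(correct_tms) - max(incorrect_tms)


def meets_design_target(correct_tms: Sequence[float], incorrect_tms: Sequence[float],
                        required_gap: float = 10.0) -> bool:
    """True when the gap is at least required_gap degrees."""
    return tm_gap(correct_tms, incorrect_tms) >= required_gap


# Placeholder melting temperatures (assumption, for illustration only).
correct = [62.1, 58.4, 60.3]
incorrect = [41.0, 47.9, 36.5]
print(round(tm_gap(correct, incorrect), 1), meets_design_target(correct, incorrect))
```

A codon-perturbation search can use `tm_gap` as its objective, accepting synonymous changes that widen the gap until the required separation is reached.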

[0153] The final codon assignment to every amino acid in the threonine deaminase protein sequence is provided in FIG. 8 (SEQ. ID. NO.: 1). FIG. 9 is the codon assignment key for E. coli. FIG. 8 forms the basis for FIG. 7, FIG. 10, and FIG. 11A-FIG. 11D. The initial codon assignment (column 0 in FIG. 9) forms the basis for FIG. 6.

[0154] The design objectives may be achieved without excessive use of rare codons, as illustrated in FIG. 10. The codon usage in the present example (open circles and dashed line) is shown as a function of genomic codon usage in E. coli. The correlation coefficient of codon usage here with codon usage in E. coli is 0.76. For comparison, the native codon usage in natural threonine deaminase is also shown (x's and solid line). The correlation coefficient of native codon usage with codon usage in E. coli is 0.81. The native correlation is somewhat higher than, but comparable to, that of the designed sequence.

[0155] The resulting DNA sequence was decomposed into the short overlapping segments shown in the overlap maps illustrated in FIG. 11A-FIG. 11E. The overlaps between short segments are indicated by brackets: [ ]. The overlaps between long strands are indicated by braces: { }. In the depiction of strands 1 and 4, three bases preceding the (−3) are the same bases as the first three bases shown after the (−3) in order to clearly illustrate the overlaps. These bases are not repeated in the actual DNA sequences. SEQ. ID. NO.: 2-SEQ. ID. NO.: 12 correspond to the sequences that comprise strand 0 (FIG. 11A), SEQ. ID. NO.: 13-SEQ. ID. NO.: 24 correspond to the sequences that comprise strand 1 (FIG. 11B), SEQ. ID. NO.: 25-SEQ. ID. NO.: 36 correspond to the sequences that comprise strand 2 (FIG. 11C), SEQ. ID. NO.: 37-SEQ. ID. NO.: 48 correspond to the sequences that comprise strand 3 (FIG. 11D), and SEQ. ID. NO.: 49-SEQ. ID. NO.: 60 correspond to the sequences that comprise strand 4 (FIG. 11E). As is described in greater detail below, each short segment was synthesized directly and assembled into the five medium-sized pieces of DNA in five parallel reactions. In a second stage, the five medium-sized pieces were assembled into the synthetic gene. The segments shown in FIG. 11A-FIG. 11E plus the various primers for overlap extension totaled 72 synthesized segments with a total of 3,093 nucleotides.

[0156] Synthesis

[0157] The target DNA sequence of the synthetic E. coli L-threonine deaminase gene has 1,542 bases. In the present example, the target DNA sequence was decomposed into a set of five medium-sized pieces, each medium-sized piece overlapping the adjacent medium-sized piece by at least 33 bases. In turn, each medium-sized piece was decomposed into “sets” of 11 or 12 small, single-stranded DNA segments, which overlapped the adjacent segment by from 18 to 50 bases. The five medium-sized pieces are designated “Medium-sized piece 0” through “Medium-sized piece 4” herein. The small single-stranded DNA segments that make up the medium-sized pieces are designated as “Seg,” medium-sized piece number, and segment number, where the segment number starts at 0 starting at the 5′-end of the forward segment. Note that the numbering for both the medium-sized pieces and the segments begins at zero. For example, the first segment of Medium-sized piece 0 is “Seg-0-0”; the seventh segment of Medium-sized piece 4 is “Seg-4-6.” The segments and primers were commercially synthesized by Illumina, Inc., San Diego, Calif.

[0158] FIG. 12A-FIG. 12E illustrate the single-stranded DNA segments used to construct Medium-sized piece 0 through Medium-sized piece 4, respectively. For each segment, the region that overlaps the adjacent complementary strand is underlined and numbered. Overlaps identified by primed numbers are complementary to the corresponding unprimed overlaps. For example, sequence 1 is complementary to sequence 1′ of the adjacent complementary strand, while sequence 2 is complementary to sequence 2′, and so forth. SEQ. ID. NO.: 61-SEQ. ID. NO.: 71 correspond to Seg-0-0, Seg-0-1, Seg-0-2 . . . Seg-0-10 illustrated in FIG. 12A, respectively. Similarly, SEQ. ID. NO.: 72-SEQ. ID. NO.: 82 correspond to Seg-1-0 through Seg-1-10 (FIG. 12B), SEQ. ID. NO.: 83-SEQ. ID. NO.: 93 correspond to Seg-2-0 through Seg-2-10 (FIG. 12C), SEQ. ID. NO.: 94-SEQ. ID. NO.: 105 correspond to Seg-3-0 through Seg-3-11 (FIG. 12D), and SEQ. ID. NO.: 106-SEQ. ID. NO.: 116 correspond to Seg-4-0 through Seg-4-10 (FIG. 12E).

[0159] Leader and Trailer Primers

[0160] For Medium-sized piece 0, the first segment, Seg-0-0, serves as the leader primer for overlap extension. The trailer primer (reverse complement) is 5′-GTAGCAGTAGGCATCAC-3′ (17-mer, SEQ. ID. NO.: 117). For Medium-sized piece 1, segment Seg-1-0 serves as the leader primer and the trailer primer is 5′-GGCTTCAACGGCTATCAC-3′ (18-mer, SEQ. ID. NO.: 118). For Medium-sized piece 2, segment Seg-2-0 serves as the leader primer and the trailer primer is 5′-GCTTAAGATGTGGGCCAG-3′ (18-mer, SEQ. ID. NO.: 119). For Medium-sized piece 3, Seg-3-0 serves as the leader primer and Seg-3-11 serves as the trailer primer. For Medium-sized piece 4, segment Seg-4-0 serves as the leader primer and the trailer primer is 5′-TTAGCCTGCGAGGAAGAAAC-3′ (20-mer, SEQ. ID. NO.: 120). Those skilled in the art will appreciate that the segments may be designed such that no added primers are used, as for Medium-sized piece 3, such that one added primer is used, as for Medium-sized pieces 0-2, and 4, or such that two added primers are used, not illustrated. In particular, if an even number of segments is used, the segments may be designed such that no added primers are needed where the first segment serves as the leader primer and the last segment serves as the trailer primer. It will also be appreciated that different flanking sequences may be added easily to these leaders and trailers.
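Because the trailer primers are given as reverse complements, it can be helpful to convert them back to the forward-strand orientation when checking them against the target sequence. The following Python sketch is illustrative only; the function name and the dictionary keys are hypothetical, while the primer sequences are those quoted above.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


# Trailer primers quoted in the text, keyed by medium-sized piece number.
TRAILER_PRIMERS = {
    0: "GTAGCAGTAGGCATCAC",
    1: "GGCTTCAACGGCTATCAC",
    2: "GCTTAAGATGTGGGCCAG",
    4: "TTAGCCTGCGAGGAAGAAAC",
}

for piece, primer in TRAILER_PRIMERS.items():
    print(piece, len(primer), reverse_complement(primer))
```

For Medium-sized piece 4, the forward-strand reading of the trailer primer ends in TAA, as would be expected for a primer covering the 3′ end of a coding sequence that terminates in a stop codon.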

[0161] Assembly of the Five Medium-Sized Pieces

[0162] First Overlap Extension Reactions. The five Medium-sized pieces were constructed in parallel by overlap extension and PCR from the appropriate set of single-stranded DNA sequences. The reaction mixture is provided in TABLE III and the thermocycler conditions in TABLE IV. The products of the overlap extension reactions were separated on a 1% agarose gel, shown in FIG. 13.

TABLE III

Reagent                                                        Quantity
Roche High Fidelity PCR Master, Vial #1 (Product # 2140314)    25.0 μL
Synthetic oligonucleotide mix *                                 1.0 μL
Leader primer, 50 μM                                            0.5 μL
Trailer primer, 50 μM                                           0.5 μL
Sterile water, Roche (molecular biology grade)                 23.0 μL
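The quantities in TABLE III can be checked with simple dilution arithmetic. The following Python sketch is illustrative only; the variable and function names are hypothetical, and the calculation assumes that the listed volumes are additive.

```python
# Volumes from TABLE III, in microliters.
TABLE_III_UL = {
    "PCR master, vial 1": 25.0,
    "synthetic oligonucleotide mix": 1.0,
    "leader primer (50 uM stock)": 0.5,
    "trailer primer (50 uM stock)": 0.5,
    "sterile water": 23.0,
}


def total_volume_ul(mix: dict) -> float:
    """Sum of all component volumes (assumes volumes are additive)."""
    return sum(mix.values())


def final_concentration_um(stock_um: float, added_ul: float, total_ul: float) -> float:
    """Final concentration after dilution: C_final = C_stock * V_added / V_total."""
    return stock_um * added_ul / total_ul


total = total_volume_ul(TABLE_III_UL)
print(total, final_concentration_um(50.0, 0.5, total))
```

Under the additive-volume assumption, the reaction volume is 50 μL and each primer is present at a final concentration of 0.5 μM.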

[0163] TABLE IV

Step    Conditions
1.      94° C. for 5 minutes for initial denaturation,
2.      25 cycles of:
            94° C. for 1 minute
            45° C. for 30 seconds
            65° C. for 5 minutes
3.      65° C. for 10 minutes for final extension
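The programmed hold time of a thermocycler protocol such as TABLE IV follows directly from the step durations. The following Python sketch is illustrative only; the function name is hypothetical, and the result excludes temperature ramp times, which depend on the instrument and are not given in the text.

```python
from typing import Sequence


def programmed_minutes(initial_min: float, cycles: int,
                       cycle_steps_min: Sequence[float], final_min: float) -> float:
    """Total hold time in minutes, ignoring instrument ramp times (assumption)."""
    return initial_min + cycles * sum(cycle_steps_min) + final_min


# TABLE IV: 5 min, then 25 x (1 min + 30 s + 5 min), then 10 min.
table_iv = programmed_minutes(5.0, 25, [1.0, 0.5, 5.0], 10.0)
# TABLE VIII: 5 min, then 20 x (1 min + 30 s + 10 min), then 10 min.
table_viii = programmed_minutes(5.0, 20, [1.0, 0.5, 10.0], 10.0)
print(table_iv, table_viii)
```

The long extension steps dominate both programs, which is consistent with the need to extend across whole medium-sized pieces or the full-length gene in each cycle.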

[0164] PCR Reaction (enrichment). Each Medium-sized piece was separately enriched by PCR using the reaction mixture provided in TABLE V and the thermocycler conditions provided in TABLE VI. The products of the PCR reactions were separated on a 1% agarose gel shown in FIG. 14.

TABLE V

Reagent                                                        Quantity
Roche High Fidelity PCR Master, Vial #1 (Product # 2140314)    25.0 μL
1:50 dilution of first overlap extension reaction product       1.0 μL
Leader primer, 50 μM                                            0.5 μL
Trailer primer, 50 μM                                           0.5 μL
Sterile water, Roche (molecular biology grade)                 23.0 μL

[0165] TABLE VI

Step    Conditions
1.      94° C. for 5 minutes for initial denaturation,
2.      25 cycles of:
            94° C. for 1 minute
            45° C. for 30 seconds
            65° C. for 5 minutes
3.      65° C. for 10 minutes for final extension

[0166] Assembly of the Five Medium-Sized Pieces into a Full Length Gene

[0167] Second Overlap Extension Reaction. The synthetic threonine deaminase gene was constructed from the five medium-sized pieces from the PCR reactions using the reaction mixture provided in TABLE VII and the thermocycler conditions provided in TABLE VIII. After the reaction was complete, product was run on a 1.2% agarose gel (FIG. 15) and the band corresponding to the threonine deaminase gene was purified from the gel with GENECLEAN® (BIO101 Systems®, Qbiogene). The purified gene was further purified with phenol-chloroform (1:1) and chloroform-isoamyl alcohol (24:1). The aqueous layer was ethanol precipitated with glycogen, and the precipitate resuspended in water and digested with NdeI and BamHI. Those skilled in the art will appreciate that other restriction sites could be incorporated in the synthetic gene.

TABLE VII

Reagent                                                        Quantity
PCR Reaction Product, Medium-sized piece 0                      2.0 μL
PCR Reaction Product, Medium-sized piece 1                      2.0 μL
PCR Reaction Product, Medium-sized piece 2                      2.0 μL
PCR Reaction Product, Medium-sized piece 3                      2.0 μL
PCR Reaction Product, Medium-sized piece 4                      2.0 μL
Leader primer, 50 μM ¹                                          1.0 μL
Trailer primer, 50 μM ²                                         1.0 μL
Sterile water, Roche (molecular biology grade)                 13.0 μL
Roche High Fidelity PCR Master, Vial #1 (Product # 2140314)    25.0 μL

[0168] TABLE VIII

Step    Conditions
1.      94° C. for 5 minutes for initial denaturation,
2.      20 cycles of:
            94° C. for 1 minute
            60° C. for 30 seconds
            65° C. for 10 minutes
3.      65° C. for 10 minutes for final extension

[0169] Directional cloning is performed on the purified threonine deaminase gene. The pET14b expression vector (Novagen) is cleaved with BamHI and NdeI, generating termini compatible with those of the digested threonine deaminase gene. The threonine deaminase insert is ligated into the vector, and the ligation product is used to transform BL21 DE3 electrocompetent cells.

EXAMPLE 2 Variola DNA Polymerase by Three-Step Recursive Decomposition and Overlap Extension Assembly

[0170] EXAMPLE 2 illustrates the synthesis of a variola DNA polymerase gene by a three-step hierarchical decomposition and reassembly by overlap extension. Variola DNA polymerase is a protein with 1,005 amino acid residues (3,015 coding bases). Variola DNA polymerase is also referred to as “Varpol” and “vpol” herein.

[0171] Design

[0172] Because the variola DNA polymerase gene was intended for expression in E. coli, codon selection considerations were similar to those used in the design of E. coli threonine deaminase described in EXAMPLE 1.

[0173] The three-step hierarchical decomposition was performed as follows. First, the gene was divided into two large pieces of about 1,500 bases. The first large piece is designated herein as “Part I” or “polymerase-1.” The second large piece is referred to herein as “Part II” or “polymerase-2.” Parts I and II were designed with complementary ApaI sites to allow their reassembly by ligation. Second, each large piece was divided into five overlapping medium-sized pieces (Intermediate Fragments), in the present example, not longer than 340 bases, with overlaps not shorter than 33 bases. Third, each medium-sized piece was divided into several overlapping short segments, in the present example not longer than 50 bases, with overlaps not shorter than 18 bases. All overlaps were lengthened if necessary to include a terminal C or G for DNA polymerase priming efficiency.

[0174] The sequences of the DNA pieces were perturbed to provide a suitable gap between the lowest-melting correct match and the highest-melting wrong match. Theoretical melting temperatures were calculated as described in EXAMPLE 1. For Part I of the gene, the melting temperatures of the correct and incorrect matches for the small and medium-sized pieces are provided in FIG. 16 for the most common E. coli codons and in FIG. 17 for the designed sequence. The designed DNA sequence is provided in FIG. 18 (SEQ. ID. NO.: 123), which is based on the codon key in FIG. 9 of EXAMPLE 1. A comparison of the codon usage in the designed Part I sequence with the codon usage in native E. coli threonine deaminase is provided in FIG. 19, illustrating the similarity in codon frequency.

[0175] Theoretical melting temperatures for the small and medium-sized pieces of Part II of the variola DNA polymerase gene are provided in FIG. 20 for the most common codons of E. coli, and for the designed sequences in FIG. 21. The designed sequence for Part II is provided in FIG. 22 (SEQ. ID. NO.: 124), which is based on the codon key in FIG. 9 of EXAMPLE 1. A comparison of the codon usage of native E. coli threonine deaminase with the designed sequence of Part II is provided in FIG. 23, illustrating the similarity in codon frequency.

[0176] Overlap maps for the overlapping short segments used to build each of the five Intermediate Fragments (medium-sized pieces) 0-4, leaders, and trailers that make up Part I of the variola DNA polymerase gene are provided in FIG. 24A-FIG. 24G. SEQ. ID. NO.: 125 corresponds to leader-0 (FIG. 24A), SEQ. ID. NO.: 126 corresponds to leader-1 (FIG. 24A), SEQ. ID. NO.: 127-SEQ. ID. NO.: 138 correspond to strand-0 (FIG. 24B), SEQ. ID. NO.: 139-SEQ. ID. NO.: 150 correspond to strand-1 (FIG. 24C), SEQ. ID. NO.: 151-SEQ. ID. NO.: 160 correspond to strand-2 (FIG. 24D), SEQ. ID. NO.: 161-SEQ. ID. NO.: 170 correspond to strand-3 (FIG. 24E), SEQ. ID. NO.: 171-SEQ. ID. NO.: 180 correspond to strand-4 (FIG. 24F), SEQ. ID. NO.: 181 corresponds to trailer-0 (FIG. 24G), and SEQ. ID. NO.: 182 corresponds to trailer-1 (FIG. 24G). As is described in greater detail below, each short segment shown in FIG. 24A-FIG. 24G was synthesized directly and assembled into the five Intermediate Fragments in five parallel reactions, then in a second stage, the five Intermediate Fragments were assembled into Part I of the variola DNA polymerase gene.

[0177] Overlap maps for the overlapping short segments used to build each of the five Intermediate Fragments 0-4, leaders, and trailers that make up Part II of the variola DNA polymerase gene are provided in FIG. 25A-FIG. 25G. SEQ. ID. NO.: 183 corresponds to leader-0 (FIG. 25A), SEQ. ID. NO.: 184 corresponds to leader-1 (FIG. 25A), SEQ. ID. NO.: 185-SEQ. ID. NO.: 196 correspond to strand-0 (FIG. 25B), SEQ. ID. NO.: 197-SEQ. ID. NO.: 208 correspond to strand-1 (FIG. 25C), SEQ. ID. NO.: 209-SEQ. ID. NO.: 218 correspond to strand-2 (FIG. 25D), SEQ. ID. NO.: 219-SEQ. ID. NO.: 228 correspond to strand-3 (FIG. 25E), SEQ. ID. NO.: 229-SEQ. ID. NO.: 238 correspond to strand-4 (FIG. 25F), SEQ. ID. NO.: 239 corresponds to trailer-0 (FIG. 25G), and SEQ. ID. NO.: 240 corresponds to trailer-1 (FIG. 25G). As is described in greater detail below, each short segment shown in FIG. 25A-FIG. 25G was synthesized directly and assembled into the five Intermediate Fragments in five parallel reactions, then in a second stage, the five Intermediate Fragments were assembled into Part II of the variola DNA polymerase gene.

[0178] Biochemistry

[0179] The variola DNA polymerase gene was assembled in the reverse order of the design process: first, the assembly of the ten Intermediate Fragments (five each for Parts I and II of the variola DNA polymerase gene); second, the assembly of the two 1500 bp large pieces, Parts I and II (which combined make up the full length gene); and finally, ApaI digestion of Part I (in the segment 1Seg-4-09) and Part II (in the segment 2Seg-2-00) to generate compatible flanking termini, allowing the two large pieces to be ligated together to generate the full-length variola DNA polymerase gene.

[0180] Variola DNA polymerase Part I. Each of the five Intermediate Fragments 0-4 that make up Part I was assembled from synthetic oligonucleotide sets of alternating strand specificity that overlap one another. The sequences of the short ssDNA segments used to construct each of the Intermediate Fragments are provided in FIG. 26A-FIG. 26E. Overlaps between adjacent segments are indicated by underlining and identifying numbers beneath the underlined regions. Overlaps identified with primed numbers are complementary to the corresponding unprimed overlaps. SEQ. ID. NO.: 241-SEQ. ID. NO.: 252 correspond to 1Seg-0-00 through 1Seg-0-11 (FIG. 26A), SEQ. ID. NO.: 253-SEQ. ID. NO.: 264 correspond to 1Seg-1-00 through 1Seg-1-11 (FIG. 26B), SEQ. ID. NO.: 265-SEQ. ID. NO.: 274 correspond to 1Seg-2-00 through 1Seg-2-09 (FIG. 26C), SEQ. ID. NO.: 275-SEQ. ID. NO.: 284 correspond to 1Seg-3-00 through 1Seg-3-09 (FIG. 26D), and SEQ. ID. NO.: 285-SEQ. ID. NO.: 294 correspond to 1Seg-4-00 through 1Seg-4-09 (FIG. 26E).

[0181] For each of the Intermediate Fragments, the first segment (1Seg-0-00, 1Seg-1-00, 1Seg-2-00, 1Seg-3-00, and 1Seg-4-00) serves as the leader primer for overlap extension. The last segment (1Seg-0-11, 1Seg-1-11, 1Seg-2-09, 1Seg-3-09, and 1Seg-4-09) serves as the trailer primer.

[0182] Variola DNA polymerase Part II. Each of the five Intermediate Fragments 0-4 that make up Part II was assembled from synthetic oligonucleotide sets of alternating strand specificity that overlap one another. The sequences of the short ssDNA segments used to construct each of the Intermediate Fragments are provided in FIG. 27A-FIG. 27E. Overlaps between adjacent segments are indicated by underlining and identifying numbers beneath the underlined regions. Overlaps identified with primed numbers are complementary to the corresponding unprimed overlaps. SEQ. ID. NO.: 295-SEQ. ID. NO.: 306 correspond to 2Seg-0-00 through 2Seg-0-11 (FIG. 27A), SEQ. ID. NO.: 307-SEQ. ID. NO.: 318 correspond to 2Seg-1-00 through 2Seg-1-11 (FIG. 27B), SEQ. ID. NO.: 319-SEQ. ID. NO.: 328 correspond to 2Seg-2-00 through 2Seg-2-09 (FIG. 27C), SEQ. ID. NO.: 329-SEQ. ID. NO.: 338 correspond to 2Seg-3-00 through 2Seg-3-09 (FIG. 27D), and SEQ. ID. NO.: 339-SEQ. ID. NO.: 348 correspond to 2Seg-4-00 through 2Seg-4-09 (FIG. 27E).

[0183] For each of the Intermediate Fragments, the first segment (2Seg-0-00, 2Seg-1-00, 2Seg-2-00, 2Seg-3-00, and 2Seg-4-00) serves as the leader primer for overlap extension. The last segment (2Seg-0-11, 2Seg-1-11, 2Seg-2-09, 2Seg-3-09, and 2Seg-4-09) serves as the trailer primer.

[0184] Assembly of the Five Intermediate Fragments into Large Pieces Part I and Part II

[0185] Each Intermediate Fragment was separately constructed in a first overlap extension reaction from the appropriate set of ssDNA sequences provided in FIG. 26A-FIG. 26E and FIG. 27A-FIG. 27E. The reaction mixture for each reaction is provided in TABLE IX and the thermocycler program in TABLE X. The “synthetic oligonucleotide mix” entry in TABLE IX is a 27.5 ng/μL mixture of equal amounts of each of the forward and reverse complement synthetic oligonucleotides for a particular Intermediate Fragment.

TABLE IX
Reagent                                                        Quantity
Roche High Fidelity PCR Master, Vial #1 (Product # 2140314)    25.0 μL
Synthetic oligonucleotide mix *                                 1.0 μL
Leader primer, 50 μM                                            0.5 μL
Trailer primer, 50 μM                                           0.5 μL
Sterile water, Roche (molecular biology grade)                 23.0 μL
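The final concentrations implied by TABLE IX follow from simple dilution arithmetic. The following Python sketch sums the listed volumes and applies the relation C1·V1 = C2·V2. The dictionary keys and the helper name `final_concentration` are illustrative only and are not part of the protocol described above.

```python
# Volumes in microliters, taken from TABLE IX. Key names are illustrative.
TABLE_IX_UL = {
    "pcr_master": 25.0,
    "oligo_mix": 1.0,
    "leader_primer": 0.5,
    "trailer_primer": 0.5,
    "water": 23.0,
}


def final_concentration(stock, volume_ul, total_ul):
    """Dilution relation C1 * V1 = C2 * V2, solved for C2."""
    return stock * volume_ul / total_ul


total_ul = sum(TABLE_IX_UL.values())

# Each primer stock is 50 uM; the oligonucleotide mix stock is 27.5 ng/uL.
primer_um = final_concentration(50.0, TABLE_IX_UL["leader_primer"], total_ul)
oligo_ng_per_ul = final_concentration(27.5, TABLE_IX_UL["oligo_mix"], total_ul)

print(f"total volume: {total_ul:.1f} uL")
print(f"each primer: {primer_um:.2f} uM")
print(f"oligonucleotide mix: {oligo_ng_per_ul:.2f} ng/uL")
```

Under these figures the reaction volume is 50 μL, each primer is present at 0.5 μM, and the oligonucleotide mix at 0.55 ng/μL in total.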

[0186] TABLE X
Step    Conditions
1.      94° C. for 5 minutes for initial denaturation
2.      20 cycles of:
            94° C. for 1 minute
            50° C. for 30 seconds
            65° C. for 5 minutes
3.      65° C. for 10 minutes for final extension
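The programmed duration of the TABLE X thermocycler program can be estimated by summing its hold times. The Python sketch below does this under the assumption that ramp times between temperatures are ignored, so the real run is somewhat longer. The `Hold` class and the function name are hypothetical conveniences.

```python
from dataclasses import dataclass


@dataclass
class Hold:
    temp_c: float   # hold temperature in degrees Celsius
    minutes: float  # hold duration in minutes


# TABLE X program. Names are illustrative, values are from the table.
INITIAL = Hold(94, 5)
CYCLE = [Hold(94, 1), Hold(50, 0.5), Hold(65, 5)]
N_CYCLES = 20
FINAL = Hold(65, 10)


def programmed_minutes(initial, cycle, n_cycles, final):
    """Sum of programmed hold times; ramps between steps are not counted."""
    per_cycle = sum(hold.minutes for hold in cycle)
    return initial.minutes + n_cycles * per_cycle + final.minutes


total_minutes = programmed_minutes(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"programmed hold time: {total_minutes:.0f} minutes")
```

The same function applies to any program of this shape by substituting its hold times and cycle count.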

[0187] The products of these reactions were separated on 1% agarose gels as shown in FIG. 28 for Part I of the variola DNA polymerase gene and FIG. 29 for Part II.

[0188] Assembly of the Intermediate Fragments into Full Length Part I and Part II of the Variola DNA Polymerase Gene

[0189] Parts I and II of the variola DNA polymerase gene were assembled in a second overlap extension reaction of their constituent Intermediate Fragment sets using the reaction mixture provided in TABLE XI and the thermocycler program provided in TABLE XII.

TABLE XI
Reagent                                                        Quantity
Intermediate Fragment 0 *                                       0.5 μL
Intermediate Fragment 1 *                                       0.5 μL
Intermediate Fragment 2 *                                       0.5 μL
Intermediate Fragment 3 *                                       0.5 μL
Intermediate Fragment 4 *                                       0.5 μL
Leader primer, 50 μM                                            1.0 μL
Trailer primer, 50 μM                                           1.0 μL
Sterile water, Roche (molecular biology grade)                 20.5 μL
Roche High Fidelity PCR Master, Vial #1 (Product # 2140314)    25.0 μL

[0190] For the construction of Part I of the variola DNA polymerase gene, two overlap extension reactions were performed using the two different sets of primers illustrated in FIG. 24A and FIG. 24G. The first reaction used the thirty base 1lead-01 leader (5′-TCCTCGAGCATAATGGATGTGCGTTGCATC-3′, SEQ. ID. NO.: 125) and the twenty-nine base 1trail-57 trailer (5′-GCGGCAGCCATAGGGCCCCTTAATCACCG-3′, SEQ. ID. NO.: 349). The second reaction used the forty-eight base 1lead-02 leader (5′-GACGACGACGACAAGCATATGCTCGAGGATATGGATGTGCGTTGCATC-3′, SEQ. ID. NO.: 126) and the forty-seven base 1trail-58 trailer (5′-TTAAGCGTAATCCGGAACATCGTATGGGTAGGGCCCCTTAATCACCG-3′, SEQ. ID. NO.: 350).

[0191] For the construction of Part II of the variola DNA polymerase gene, two overlap extension reactions were performed using two different sets of primers illustrated in FIG. 25A and FIG. 25G. The first reaction used the thirty base 2lead-01 leader (5′-TCCTCGAGCATAGGGCCCCTGCTGAAATTG-3′, SEQ. ID. NO.: 183) and the twenty-nine base 2trail-57 trailer (5′-GCGGCAGCCATACGCCTCATAGAAGGTCG-3′, SEQ. ID. NO.: 351). The second reaction used the forty-eight base 2lead-02 leader (5′-GACGACGACGACAAGCATATGCTCGAGGATGGGCCCCTGCTGAAATTG-3′, SEQ. ID. NO.: 184) and the forty-seven base 2trail-58 trailer (5′-TTAAGCGTAATCCGGAACATCGTATGGGTACGCCTCATAGAAGGTCG-3′, SEQ. ID. NO.: 352).

TABLE XII
Step    Conditions
1.      94° C. for 5 minutes for initial denaturation
2.      20 cycles of:
            94° C. for 1 minute
            60° C. for 30 seconds
            65° C. for 10 minutes
3.      65° C. for 10 minutes for final extension
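The primer sequences quoted in paragraphs [0190] and [0191] can be checked against their stated lengths, and against the presence of the ApaI recognition sequence (GGGCCC) used later to join Parts I and II. The Python sketch below is an illustrative consistency check only. The dictionary layout and function names are assumptions, and the sequences are transcribed from the text with internal spaces removed.

```python
# Primer name -> (sequence 5'->3', stated length in bases).
PRIMERS = {
    "1lead-01": ("TCCTCGAGCATAATGGATGTGCGTTGCATC", 30),
    "1trail-57": ("GCGGCAGCCATAGGGCCCCTTAATCACCG", 29),
    "1lead-02": ("GACGACGACGACAAGCATATGCTCGAGGATATGGATGTGCGTTGCATC", 48),
    "1trail-58": ("TTAAGCGTAATCCGGAACATCGTATGGGTAGGGCCCCTTAATCACCG", 47),
    "2lead-01": ("TCCTCGAGCATAGGGCCCCTGCTGAAATTG", 30),
    "2trail-57": ("GCGGCAGCCATACGCCTCATAGAAGGTCG", 29),
    "2lead-02": ("GACGACGACGACAAGCATATGCTCGAGGATGGGCCCCTGCTGAAATTG", 48),
    "2trail-58": ("TTAAGCGTAATCCGGAACATCGTATGGGTACGCCTCATAGAAGGTCG", 47),
}

APAI_SITE = "GGGCCC"  # ApaI recognition sequence
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primers_with_site(primers, site):
    """Names of primers containing the site on either strand."""
    return sorted(
        name
        for name, (seq, _) in primers.items()
        if site in seq or site in reverse_complement(seq)
    )


lengths_ok = all(len(seq) == stated for seq, stated in PRIMERS.values())
apai_primers = primers_with_site(PRIMERS, APAI_SITE)

print("all stated lengths match:", lengths_ok)
print("primers carrying an ApaI site:", apai_primers)
```

In this check the ApaI site appears only in the Part I trailers and the Part II leaders, which is consistent with joining the two parts at that site.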

[0192] The products of these reactions were separated on a 1% agarose gel, which is illustrated in FIG. 30.

[0193] Ligation of Part I and Part II to Generate the Full Length Variola DNA Polymerase Gene (3000 bp)

[0194] The DNA from the lanes corresponding to Parts I and II of the variola DNA polymerase gene was purified using GENECLEAN® (BIO101 Systems®, Qbiogene). The two fragments were separately digested overnight with ApaI (Boehringer Mannheim), purified on a 1% agarose gel, and purified using GENECLEAN®.

[0195] Parts I and II were ligated for 1.25 hr at ambient temperature under the conditions provided in TABLE XIII. The products were quantified by fluorometry. Ligation was confirmed by PCR using the 1lead-01 and 2trail-57 primers, the product of which was isolated on a 1% agarose gel, shown in FIG. 31. The PCR product was run in lane 1 and 1 kB Plus ladder in lane 2.

TABLE XIII
Reagent                    Quantity
Water                      15 μL
Part I (22 ng/μL)           1 μL
Part II (22 ng/μL)          1 μL
T4 10X ligation buffer      2 μL
T4 ligase (400 U/μL)        1 μL

EXAMPLE 3 E. coli Threonine Deaminase by One-Step Hierarchical Decomposition and Ligation

[0196] EXAMPLE 3 illustrates the synthesis of an E. coli threonine deaminase gene by dividing the gene in a one-step hierarchical decomposition and synthesizing the gene by the direct self-assembly method.

[0197] Design

[0198] In the one-step hierarchical decomposition, the gene was divided directly into 54 overlapping short segments, in the present example not longer than about 60 bases, with overlaps not shorter than about 27 bases. Because the gene was designed for reassembly by ligation, the adjacent short segments on the same strand abut, i.e., with no single-stranded gaps between the double-stranded overlaps. The overlaps were not designed to terminate in a G or C.

[0199] Theoretical melting temperatures were calculated as described in EXAMPLE 1. The distribution of calculated melting temperatures of the short segments using the most common E. coli codons is provided in FIG. 32. The codons were permuted to increase the gap in the melting temperatures as described above, resulting in the calculated melting temperatures for the final short segments provided in FIG. 33. The calculated temperature gap in this example was 21.6° C. (77.1° C.-55.5° C.). The final sequence for the synthetic E. coli threonine deaminase gene is provided in FIG. 34 (SEQ. ID. NO.: 353) using the codon key from FIG. 9 of EXAMPLE 1. A comparison of the codon usage of the designed threonine deaminase gene compared with that of the native gene is illustrated in FIG. 35, indicating the similarity in codon frequency.
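The temperature gap quoted above is a simple difference of two calculated melting temperatures. The Python sketch below illustrates that arithmetic under one assumption: that 77.1° C. is the lowest calculated melting temperature among the intended (adjacent) overlaps and 55.5° C. is the highest among the unintended (non-adjacent) pairings. The function name is hypothetical, and each list holds only the single value quoted in the text, not the full distributions of FIG. 33.

```python
def temperature_gap(intended_tms_c, unintended_tms_c):
    """Gap between the weakest intended and the strongest unintended pairing.

    Both arguments are lists of calculated melting temperatures in
    degrees Celsius. A larger positive gap leaves a wider window of
    annealing temperatures that favors only the intended overlaps.
    """
    return min(intended_tms_c) - max(unintended_tms_c)


# Only the two extreme values quoted in the text are used here.
gap_c = temperature_gap([77.1], [55.5])
print(f"temperature gap: {gap_c:.1f} C")
```

With the full sets of calculated melting temperatures, the same function would report the gap that codon permutation is intended to widen.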

[0200] The resulting DNA sequence was decomposed into the 54 short overlapping segments shown in an overlap map in FIG. 36 (SEQ. ID. NO.: 354-SEQ. ID. NO.: 407). Each short segment in FIG. 36 was synthesized directly and assembled into the synthetic gene in one reaction step.

[0201] Biochemistry

[0202] The threonine deaminase gene was assembled in two steps by direct self-assembly. The sequences of the leaders, short ssDNA segments, and trailers are provided in FIG. 37A-FIG. 37F. SEQ. ID. NO.: 408-SEQ. ID. NO.: 411 correspond to the gene leaders illustrated in FIG. 37A, SEQ. ID. NO.: 412-SEQ. ID. NO.: 465 correspond to Seg-0-0 through Seg-0-53 (FIG. 37B-FIG. 37E), and SEQ. ID. NO.: 466-SEQ. ID. NO.: 469 correspond to the gene trailers illustrated in FIG. 37F. Overlaps between adjacent segments are not indicated. The DNA segments were purchased from a commercial source.

[0203] The threonine deaminase gene was first divided into four Medium-sized pieces of equal size, which were each reassembled by direct self-assembly and ligation; then the four Medium-sized pieces were combined and ligated into the full-length gene.

[0204] Direct Self-Assembly of the Four Medium-Sized Pieces

[0205] Each of the four Medium-sized pieces was constructed in parallel as follows.

[0206] Annealing Reaction. The short segments were first annealed in a thermocycler to form DNA constructs corresponding to the Medium-sized pieces. Each Medium-sized piece was divided into two parts, a forward strand and a reverse strand. For each strand, a 20 μL solution of the short segments corresponding to each strand at a concentration of 0.825 μM was treated with 800 U of T4 polynucleotide kinase. The kinased, short segments corresponding to the forward and reverse strands of each Medium-sized piece were mixed together. A phenol extraction was performed on the mixture, followed by an ethanol precipitation and a 70% ethanol wash. The pellet was resuspended in 7.8 μL of 1×TE, from which a solution detailed in TABLE XIV was prepared. The concentration of each short segment was 1.65 μM in this solution. This solution was placed in a thermocycler, which was programmed as described in TABLE XV.

TABLE XIV
Reagent                          Quantity
Synthetic oligonucleotide mix    7.8 μL
NaCl, 5 M                        0.2 μL
MgCl2, 1 M                       1.0 μL
Nuclease free water              1.0 μL

[0207] TABLE XV
Step    Conditions
1.      94° C. for 5 minutes for initial denaturation
2.      80° C. for 1 minute
3.      Cool to 55° C. at 0.5° C./min
4.      4° C.
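The salt concentrations in the annealing solution of TABLE XIV and the length of the slow-cooling step of TABLE XV follow from the listed values. The Python sketch below works through that arithmetic. The variable names are illustrative, and the cooling estimate assumes a linear ramp from 80° C. to 55° C. at the stated rate.

```python
# Volumes in microliters, from TABLE XIV. Key names are illustrative.
TABLE_XIV_UL = {
    "oligo_mix": 7.8,
    "nacl_5M": 0.2,
    "mgcl2_1M": 1.0,
    "water": 1.0,
}

total_ul = sum(TABLE_XIV_UL.values())

# Final salt concentrations in mM, using C1 * V1 = C2 * V2.
nacl_mm = 5000.0 * TABLE_XIV_UL["nacl_5M"] / total_ul
mgcl2_mm = 1000.0 * TABLE_XIV_UL["mgcl2_1M"] / total_ul

# TABLE XV, step 3: cool from 80 C to 55 C at 0.5 C per minute.
ramp_minutes = (80.0 - 55.0) / 0.5

print(f"total volume: {total_ul:.1f} uL")
print(f"NaCl: {nacl_mm:.0f} mM, MgCl2: {mgcl2_mm:.0f} mM")
print(f"cooling ramp: {ramp_minutes:.0f} minutes")
```

On these figures the annealing solution is 10 μL with roughly 100 mM NaCl and 100 mM MgCl2, and the cooling step takes about 50 minutes.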

[0208] Ligation Reaction. Each Medium-sized piece was produced by ligation of the corresponding DNA construct synthesized in the annealing reaction, described above. The reaction mixture provided in TABLE XVI was maintained at 16° C. overnight. An agarose gel of the resulting four Medium-sized pieces is provided in FIG. 38.

TABLE XVI
Reagent                                           Quantity
DNA construct solution from annealing reaction    10.0 μL
T4 DNA Ligase, 400 U/mL                            2.0 μL
10X Ligase Buffer                                  2.0 μL
Nuclease free water                                6.0 μL

[0209] Assembly of the Four Medium-Sized Pieces into the Threonine Deaminase Gene

[0210] Each Medium-sized piece was isolated from the gel using GENECLEAN® (BIO101 Systems®, Qbiogene). A ligation reaction mixture containing the four Medium-sized pieces is provided in TABLE XVII. The ligation reaction was performed at 16° C. overnight. An agarose gel of the products, including the full-length threonine deaminase gene, is provided in FIG. 39.

TABLE XVII
Reagent                                           Quantity
DNA construct solution from annealing reaction    10.0 μL
T4 DNA Ligase, 400 U/mL                            2.0 μL
10X Ligase Buffer                                  2.0 μL
Nuclease free water                                6.0 μL

[0211] At this point the assembled gene may be stored at 4° C. or cloned into expression vector pET14b (Novagen) by ExoIII digestion. The threonine deaminase gene is designed to have 12 bp 3′-end overhangs which are compatible with 5′-overhang regions of the pET14b vector after it has been treated with ExoIII for 1 minute at 14° C. The insert and vector are ligated by mixing and heating, followed by cooling to a temperature below Tm for the overlapping regions of the insert and vector. The annealed fragments are transformed into an E. coli host at 37° C.

EXAMPLE 4 Ty3 GAG3 by Two-Step Recursive Decomposition with Sampling and Sequencing

[0212] EXAMPLE 4 illustrates the synthesis of GAG3 by two-step recursive decomposition with sampling and sequencing. The GAG3 open reading frame (ORF) is 876 bp long.

[0213] The GAG3 ORF was divided into three overlapping intermediate fragments for reassembly by overlap extension. FIG. 40A-FIG. 40E illustrate the sequences of the gene leader, Fragments 0-2, and gene trailer. Fragment 0 was 307 bp, Fragment 1 was 324 bp, and Fragment 2 was 343 bp. Each intermediate fragment overlapped the adjacent one by 38 nt. Each intermediate fragment was divided into 10 oligonucleotides (50 nt) for reassembly by overlap extension. Adjacent oligonucleotides overlapped by about 19 bp. Collectively, the assembled sequence encoded both strands of each of the intermediate fragments. SEQ. ID. NO.: 470 corresponds to gene leader-0 (FIG. 40A), SEQ. ID. NO.: 471-SEQ. ID. NO.: 480 correspond to Seg-0-0 through Seg-0-9 (FIG. 40B), SEQ. ID. NO.: 481-SEQ. ID. NO.: 490 correspond to Seg-1-0 through Seg-1-9 (FIG. 40C), SEQ. ID. NO.: 491-SEQ. ID. NO.: 501 correspond to Seg-2-0 through Seg-2-10 (FIG. 40D), SEQ. ID. NO.: 502 corresponds to strand 2 trailer (FIG. 40D), and SEQ. ID. NO.: 503 corresponds to gene trailer-0 (FIG. 40E).

[0214] In the assembly of the intermediate fragments, the oligonucleotides were mixed to a final concentration of 0.1 μM with DNA polymerase (Proofstart®, Qiagen) and appropriate leader and trailer sequences. FIG. 41 is an agarose gel showing the products of these reactions. Fragment 0 is in lane 1, Fragment 1 in lane 2, and Fragment 2 in lane 3. Lane 4 contains a 2-Log DNA molecular weight ladder (New England Biolabs).

[0215] The intermediate fragments were each cloned using a blunt-end ligation procedure (pCR-Blunt II-TOPO® vector, Invitrogen). Four clones of each intermediate fragment were sequenced and a correct sequence was selected. The selected sequences were amplified out of the vector by PCR (Proofstart® DNA polymerase, Qiagen).

[0216] The intermediate fragments were mixed and extended to full-duplex DNA as described for the oligonucleotides. FIG. 42 is an agarose gel of the full-length GAG3 gene. The identity and accuracy of the gene were verified by sequencing both strands of the DNA product.

[0217] The synthetic GAG3 gene was cloned into the pET-3a plasmid (Novagen) using NdeI and BamHI endonuclease restriction sites designed in the 5′ and 3′ PCR gene primers. The resulting plasmid contained the entire GAG3 gene under the control of an inducible T7 promoter and a bacterial ribosome-binding site (Shine-Dalgarno sequence). The BL21(DE3)pLysS strain of E. coli (Novagen) was transformed with the plasmid. Expression of the host-encoded T7 RNA polymerase was induced using isopropyl-1-thio-β-D-galactopyranoside at a concentration of 0.4 mM. At 30 min intervals, cells were harvested by centrifugation and sonicated. FIG. 43 is an SDS-PAGE gel of the sonicate stained with Coomassie, indicating the expression of Gag3p.

EXAMPLE 5 Ty3 IN Gene by Two-Step Recursive Decomposition with Sampling and Sequencing

[0218] EXAMPLE 5 illustrates the synthesis of the Ty3 IN gene by a two-step recursive decomposition with sampling and sequencing. The Ty3 IN gene is 1640 bp long and is illustrated in FIG. 44 (SEQ. ID. NO.: 504).

[0219] The Ty3 IN gene was divided into ten intermediate fragments for reassembly by overlap extension. Overlap maps for the leader, ten intermediate fragments, and trailer are provided in FIG. 45A-FIG. 45L: leader (FIG. 45A, SEQ. ID. NO.: 505), Fragment 0 (196 bp, FIG. 45B, SEQ. ID. NO.: 506-SEQ. ID. NO.: 513), Fragment 1 (224 bp, FIG. 45C, SEQ. ID. NO.: 514-SEQ. ID. NO.: 521), Fragment 2 (224 bp, FIG. 45D, SEQ. ID. NO.: 522-SEQ. ID. NO.: 529), Fragment 3 (223 bp, FIG. 45E, SEQ. ID. NO.: 530-SEQ. ID. NO.: 537), Fragment 4 (227 bp, FIG. 45F, SEQ. ID. NO.: 538-SEQ. ID. NO.: 545), Fragment 5 (223 bp, FIG. 45G, SEQ. ID. NO.: 546-SEQ. ID. NO.: 553), Fragment 6 (224 bp, FIG. 45H, SEQ. ID. NO.: 554-SEQ. ID. NO.: 561), Fragment 7 (172 bp, FIG. 45I, SEQ. ID. NO.: 562-SEQ. ID. NO.: 567), Fragment 8 (175 bp, FIG. 45J, SEQ. ID. NO.: 568-SEQ. ID. NO.: 573), Fragment 9 (174 bp, FIG. 45K, SEQ. ID. NO.: 574-SEQ. ID. NO.: 580), and trailer (FIG. 45L, SEQ. ID. NO.: 581). Each of the intermediate fragments was divided into 50 nt oligonucleotides for reassembly by direct self-assembly illustrated in FIG. 46A-FIG. 46L. SEQ. ID. NO.: 582 corresponds to the gene leader (FIG. 46A), SEQ. ID. NO.: 583-SEQ. ID. NO.: 590 correspond to Seg-0-0 through Seg-0-7 (FIG. 46B), SEQ. ID. NO.: 591-SEQ. ID. NO.: 598 correspond to Seg-1-0 through Seg-1-7 (FIG. 46C), SEQ. ID. NO.: 599-SEQ. ID. NO.: 606 correspond to Seg-2-0 through Seg-2-7 (FIG. 46D), SEQ. ID. NO.: 607-SEQ. ID. NO.: 614 correspond to Seg-3-0 through Seg-3-7 (FIG. 46E), SEQ. ID. NO.: 615-SEQ. ID. NO.: 622 correspond to Seg-4-0 through Seg-4-7 (FIG. 46F), SEQ. ID. NO.: 623-SEQ. ID. NO.: 630 correspond to Seg-5-0 through Seg-5-7 (FIG. 46G), SEQ. ID. NO.: 631-SEQ. ID. NO.: 638 correspond to Seg-6-0 through Seg-6-7 (FIG. 46H), SEQ. ID. NO.: 639-SEQ. ID. NO.: 644 correspond to Seg-7-0 through Seg-7-5 (FIG. 46I), SEQ. ID. NO.: 645-SEQ. ID. NO.: 650 correspond to Seg-8-0 through Seg-8-5 (FIG. 46J), SEQ. ID. NO.: 651-SEQ. ID. NO.: 657 correspond to Seg-9-0 through Seg-9-6 (FIG. 46K), SEQ. ID. NO.: 658 corresponds to the strand 9 trailer (FIG. 46K), and SEQ. ID. NO.: 659 corresponds to the gene trailer (FIG. 46L).

[0220] The ten intermediate fragments were separately assembled from the oligonucleotides, cloned, and sequenced as described in EXAMPLE 4. FIG. 47 is an agarose gel showing the products of the ten intermediate fragment reactions. The ten intermediate fragments were reassembled into the Ty3 IN gene using overlap extension as described in EXAMPLE 4. An agarose gel of the product is provided in FIG. 48. Lane 1 is the synthetic Ty3 IN gene, and lane 2, DNA size markers. The synthetic Ty3 IN gene was cloned and expressed as described in EXAMPLE 4. An SDS-PAGE gel of the Coomassie stained Ty3 IN protein is provided in FIG. 49. Lane 1 contains molecular weight markers; lane 2, uninduced cellular extract; and lane 3, induced cellular extract.

[0221] The embodiments illustrated and described herein are provided as examples of certain preferred embodiments. Various changes and modifications can be made to the embodiments presented herein by those skilled in the art without departure from the disclosure.

1 659 1 1542 DNA Artificial Sequence Synthetic DNA 1 atggccgattctcaacctct gtctggagca cctgaaggag cagaatattt acgggcagtg 60 ttacgtgcgccggtgtatga agccgcccag gtgaccccgt tacagaaaat ggaaaaactc 120 agttcccgtctcgataatgt gattctggtc aagcgcgagg accgacagcc cgtgcactcg 180 ttcaagctccgtggtgcgta tgcgatgatg gcagggttga cggaagaaca gaaagcccac 240 ggtgtgattacggcatcagc tggcaaccat gctcaaggtg tggcgttctc ttctgctcga 300 ctgggagtgaaagcgttaat cgtgatgcct actgctacag cggatattaa agtggatgcc 360 gtccgagggtttggtggtga agttctgctg catggcgcga actttgatga agccaaggcc 420 aaggcgatcgagctctctca acaacagggg ttcacgtggg tgccaccatt cgatcatccg 480 atggtaatcgccggtcaggg gacgttagca ctggagttgc ttcaacagga cgcacatctc 540 gaccgggtcttcgttcctgt tgggggtggt ggtctggcgg cgggtgtagc agtactcatc 600 aagcagctcatgccacaaat taaagtgata gccgttgaag ccgaagattc cgcatgtctg 660 aaggccgcacttgatgccgg acaccctgtc gatctgccgc gtgtggggct gtttgcagaa 720 ggggttgcggtgaaacggat tggggatgag accttccgcc tatgccagga gtatttggac 780 gacatcatcaccgtggactc cgatgccatt tgtgccgcca tgaaggacct attcgaagat 840 gtccgtgcagtcgccgaacc gtctggagct ttagcattag ccgggatgaa gaagtacatt 900 gctctgcacaacatccgagg cgaacgactg gcccacatct taagcggtgc gaatgtcaac 960 ttccacggcttacggtatgt gtctgagcgt tgcgagctgg gcgaacaaag agaagcatta 1020 ctggcagtgaccattccgga agaaaaaggt tcgttcctca agttctgcca gctgttagga 1080 ggtcggagcgtcacggaatt taactatcgg tttgcagacg ccaagaatgc ctgtattttt 1140 gtgggtgtgaggttgagcag gggattggag gagcgcaagg agattcttca gatgctgaac 1200 gatggcggttatagcgtggt ggacctgagc gacgacgaaa tggctaaact acacgtacgc 1260 tacatggtgggtggacgacc ttcacatccc ctccaggagc gactgtattc ctttgaattc 1320 ccagagtctcccggcgcctt attacgtttc ttaaacaccc tgggcaccta ttggaatatc 1380 agcctgttccactaccgatc tcatgggacg gattacgggc gtgttctgct ggcgtttgag 1440 cttggcgatcatgaaccgga ctttgaaacg cgcctgaacg aactgggcta tgattgccat 1500 gatgagaccaacaaccccgc ctttcgtttc ttcctcgcag gc 1542 2 50 DNA Artificial SequenceSynthetic DNA 2 ctatactgca gatggccgat tctcaaccac tgtctggagc tcctgaaggg50 3 50 DNA Artificial Sequence Synthetic DNA 3 gtctggagct 
cctgaaggggcagaatattt acgggcagtg ttacgtgcgc 50 4 47 DNA Artificial SequenceSynthetic DNA 4 gggcagtgtt acgtgcgccg gtgtatgaag ccgcccaggt gaccccg 47 549 DNA Artificial Sequence Synthetic DNA 5 gccgcccagg tgaccccgttacagaaaatg gaaaaactct cctcccggc 49 6 49 DNA Artificial SequenceSynthetic DNA 6 gaaaaactct cctcccggct cgataatgtg attctggtca agcgcgagg 497 50 DNA Artificial Sequence Synthetic DNA 7 gattctggtc aagcgcgaggaccgtcagcc cgtgcactcg ttcaagctcc 50 8 50 DNA Artificial SequenceSynthetic DNA 8 gtgcactcgt tcaagctccg tggtgcctat gcgatgatgg cgggcctgac50 9 50 DNA Artificial Sequence Synthetic DNA 9 gatgatggcg ggcctgacggaagaacagaa agcccacggt gtgattacgg 50 10 50 DNA Artificial SequenceSynthetic DNA 10 cccacggtgt gattacggca tcagcaggca accatgctca aggtgtggcg50 11 49 DNA Artificial Sequence Synthetic DNA 11 catgctcaag gtgtggcgttctcttctgct cgactgggag tgaaagcgc 49 12 50 DNA Artificial SequenceSynthetic DNA 12 gactgggagt gaaagcgctg attgtgatgc ctacagctac tcgagaatac50 13 49 DNA Artificial Sequence Synthetic DNA 13 ctatactgca gagtgaaagcgctgattgtg atgcctacag ctacagccg 49 14 50 DNA Artificial SequenceSynthetic DNA 14 gatgcctaca gctacagccg atattaaagt ggatgcggtg cgtggcttcg50 15 48 DNA Artificial Sequence Synthetic DNA 15 gatgcggtgc gtggcttcggtggtgaagtt ctgctgcatg gcgcgaac 48 16 49 DNA Artificial SequenceSynthetic DNA 16 ctgctgcatg gcgcgaactt tgatgaagcc aaggccaagg cgatcgagc49 17 50 DNA Artificial Sequence Synthetic DNA 17 caaggccaag gcgatcgagctctctcaaca acaggggttc acgtgggtgc 50 18 50 DNA Artificial SequenceSynthetic DNA 18 caggggttca cgtgggtgcc accgtttgat catccgatgg tcatcgccgg50 19 49 DNA Artificial Sequence Synthetic DNA 19 catccgatgg tcatcgccggtcaaggcacg ttagcgctgg agttgcttc 49 20 50 DNA Artificial SequenceSynthetic DNA 20 gttagcgctg gagttgcttc aacaggacgc acacctcgac cgggtcttcg50 21 50 DNA Artificial Sequence Synthetic DNA 21 cacctcgacc gggtcttcgttcctgttggg ggtggtggtc tggcggcggg 50 22 49 DNA Artificial SequenceSynthetic DNA 22 gtggtggtct ggcggcgggg gtagcagtac tcatcaagca gctcatgcc49 23 49 DNA 
Artificial Sequence Synthetic DNA 23 catcaagcag ctcatgccacaaattaaagt gatagccgtt gaagcctcg 49 24 26 DNA Artificial SequenceSynthetic DNA 24 gatagccgtt gaagcctcga gaatac 26 25 48 DNA ArtificialSequence Synthetic DNA 25 ctatactgca gctcatgcca caaattaaag tgatagccgttgaagccg 48 26 49 DNA Artificial Sequence Synthetic DNA 26 gtgatagccgttgaagccga agattccgca tgcctgaagg ccgcacttg 49 27 50 DNA ArtificialSequence Synthetic DNA 27 gcctgaaggc cgcacttgac gccggacatc cagtcgacctgccgcgcgtg 50 28 50 DNA Artificial Sequence Synthetic DNA 28 gtcgacctgccgcgcgtggg gctgtttgca gaaggggttg cggtgaaacg 50 29 49 DNA ArtificialSequence Synthetic DNA 29 gaaggggttg cggtgaaacg gattggggat gagaccttccgcctatgcc 49 30 49 DNA Artificial Sequence Synthetic DNA 30 gagaccttccgcctatgcca ggagtatttg gacgacatca tcaccgtgg 49 31 49 DNA ArtificialSequence Synthetic DNA 31 gacgacatca tcaccgtgga ctccgatgcc atttgtgccgccatgaagg 49 32 50 DNA Artificial Sequence Synthetic DNA 32 catttgtgccgccatgaagg acctattcga ggatgtccgt gcagtcgccg 50 33 48 DNA ArtificialSequence Synthetic DNA 33 gatgtccgtg cagtcgccga accgtctgga gctctcgcactggccggg 48 34 50 DNA Artificial Sequence Synthetic DNA 34 gctctcgcactggccgggat gaagaagtac attgctctgc acaacatccg 50 35 49 DNA ArtificialSequence Synthetic DNA 35 cattgctctg cacaacatcc gaggcgaacg actggcccacatcctgagc 49 36 28 DNA Artificial Sequence Synthetic DNA 36 ctggcccacatcctgagctc gagaatac 28 37 50 DNA Artificial Sequence Synthetic DNA 37ctatactgca gcacaacatc cgaggcgaac gactggccca catcctgagc 50 38 48 DNAArtificial Sequence Synthetic DNA 38 ctggcccaca tcctgagcgg tgcgaatgtcaacttccacg gcttacgg 48 39 50 DNA Artificial Sequence Synthetic DNA 39caacttccac ggcttacggt atgtgtctga gcgttgcgag ctgggcgaac 50 40 50 DNAArtificial Sequence Synthetic DNA 40 gttgcgagct gggcgaacaa cgcgaagcattactggcagt gaccattccg 50 41 49 DNA Artificial Sequence Synthetic DNA 41ctggcagtga ccattccgga agaaaaaggt tcgttcctca agttctgcc 49 42 45 DNAArtificial Sequence Synthetic DNA 42 cgttcctcaa gttctgccag ctgttaggaggtcggagcgt cacgg 45 43 47 
DNA Artificial Sequence Synthetic DNA 43gaggtcggag cgtcacggaa tttaactatc ggtttgcaga cgccaag 47 44 50 DNAArtificial Sequence Synthetic DNA 44 cggtttgcag acgccaagaa tgcctgtatttttgtgggtg tgaggttgag 50 45 50 DNA Artificial Sequence Synthetic DNA 45gtatttttgt gggtgtgagg ttgagcaggg gattggagga gcgcaaggag 50 46 46 DNAArtificial Sequence Synthetic DNA 46 gattggagga gcgcaaggag attcttcagatgctgaacga tggcgg 46 47 47 DNA Artificial Sequence Synthetic DNA 47gatgctgaac gatggcggtt atagcgtggt ggacctgagc gacgacg 47 48 36 DNAArtificial Sequence Synthetic DNA 48 gtggacctga gcgacgacga aatggctcgagaatac 36 49 47 DNA Artificial Sequence Synthetic DNA 49 ctatactgcagttatagcgt ggtggacctg agcgacgacg aaatggc 47 50 50 DNA ArtificialSequence Synthetic DNA 50 gagcgacgac gaaatggcta aactacacgt acgctacatggtgggtggac 50 51 47 DNA Artificial Sequence Synthetic DNA 51 gctacatggtgggtggacga ccttcacatc ccctccagga gcgactg 47 52 50 DNA ArtificialSequence Synthetic DNA 52 cccctccagg agcgactgta ttcctttgaa ttcccagagtctcccggcgc 50 53 49 DNA Artificial Sequence Synthetic DNA 53 cccagagtctcccggcgcct tattacgttt cttaaacacc ctgggcacc 49 54 48 DNA ArtificialSequence Synthetic DNA 54 cttaaacacc ctgggcacct attggaatat cagcctgttccactaccg 48 55 50 DNA Artificial Sequence Synthetic DNA 55 cagcctgttccactaccgat ctcacggcac ggattacggg cgtgttctgg 50 56 49 DNA ArtificialSequence Synthetic DNA 56 gattacgggc gtgttctggc ggcgtttgaa ctgggcgatcatgaaccgg 49 57 48 DNA Artificial Sequence Synthetic DNA 57 ctgggcgatcatgaaccgga ctttgaaacg cgcctgaacg aactgggc 48 58 50 DNA ArtificialSequence Synthetic DNA 58 cgcctgaacg aactgggcta tgattgccat gatgagaccaacaaccccgc 50 59 49 DNA Artificial Sequence Synthetic DNA 59 gatgagaccaacaaccccgc ctttcgtttc ttcctcgccg gctaactcg 49 60 27 DNA ArtificialSequence Synthetic DNA 60 cttcctcgcc ggctaactcg agaatac 27 61 36 DNAArtificial Sequence Synthetic DNA 61 ggccgattct caacctctgt ctggagcacctgaagg 36 62 50 DNA Artificial Sequence Synthetic DNA 62 gcgcacgtaacactgcccgt aaatattctg ctccttcagg tgctccagac 50 63 47 DNA 
ArtificialSequence Synthetic DNA 63 gggcagtgtt acgtgcgccg gtgtatgaag ccgcccaggtgaccccg 47 64 49 DNA Artificial Sequence Synthetic DNA 64 gacgggaactgagtttttcc attttctgta acggggtcac ctgggcggc 49 65 49 DNA ArtificialSequence Synthetic DNA 65 gaaaaactca gttcccgtct cgataatgtg attctggtcaagcgcgagg 49 66 50 DNA Artificial Sequence Synthetic DNA 66 ggagcttgaacgagtgcacg ggctgtcggt cctcgcgctt gaccagaatc 50 67 50 DNA ArtificialSequence Synthetic DNA 67 gtgcactcgt tcaagctccg tggtgcgtat gcgatgatggcagggttgac 50 68 50 DNA Artificial Sequence Synthetic DNA 68 ccgtaatcacaccgtgggct ttctgttctt ccgtcaaccc tgccatcatc 50 69 50 DNA ArtificialSequence Synthetic DNA 69 cccacggtgt gattacggca tcagctggca accatgctcaaggtgtggcg 50 70 48 DNA Artificial Sequence Synthetic DNA 70 cgctttcactcccagtcgag cagaagagaa cgccacacct tgagcatg 48 71 41 DNA ArtificialSequence Synthetic DNA 71 cgactgggag tgaaagcgtt aatcgtgatg cctactgcta c41 72 39 DNA Artificial Sequence Synthetic DNA 72 gagtgaaagc gttaatcgtgatgcctactg ctacagcgg 39 73 50 DNA Artificial Sequence Synthetic DNA 73caaaccctcg gacggcatcc actttaatat ccgctgtagc agtaggcatc 50 74 48 DNAArtificial Sequence Synthetic DNA 74 gatgccgtcc gagggtttgg tggtgaagttctgctgcatg gcgcgaac 48 75 49 DNA Artificial Sequence Synthetic DNA 75gctcgatcgc cttggccttg gcttcatcaa agttcgcgcc atgcagcag 49 76 50 DNAArtificial Sequence Synthetic DNA 76 caaggccaag gcgatcgagc tctctcaacaacaggggttc acgtgggtgc 50 77 50 DNA Artificial Sequence Synthetic DNA 77ccggcgatta ccatcggatg atcgaatggt ggcacccacg tgaacccctg 50 78 49 DNAArtificial Sequence Synthetic DNA 78 catccgatgg taatcgccgg tcaggggacgttagcactgg agttgcttc 49 79 50 DNA Artificial Sequence Synthetic DNA 79cgaagacccg gtcgagatgt gcgtcctgtt gaagcaactc cagtgctaac 50 80 50 DNAArtificial Sequence Synthetic DNA 80 catctcgacc gggtcttcgt tcctgttgggggtggtggtc tggcggcggg 50 81 49 DNA Artificial Sequence Synthetic DNA 81ggcatgagct gcttgatgag tactgctaca cccgccgcca gaccaccac 49 82 46 DNAArtificial Sequence Synthetic DNA 82 catcaagcag ctcatgccac 
aaattaaagtgatagccgtt gaagcc 46 83 38 DNA Artificial Sequence Synthetic DNA 83gctcatgcca caaattaaag tgatagccgt tgaagccg 38 84 49 DNA ArtificialSequence Synthetic DNA 84 caagtgcggc cttcagacat gcggaatctt cggcttcaacggctatcac 49 85 50 DNA Artificial Sequence Synthetic DNA 85 gtctgaaggccgcacttgat gccggacacc ctgtcgatct gccgcgtgtg 50 86 50 DNA ArtificialSequence Synthetic DNA 86 cgtttcaccg caaccccttc tgcaaacagc cccacacgcggcagatcgac 50 87 49 DNA Artificial Sequence Synthetic DNA 87 gaaggggttgcggtgaaacg gattggggat gagaccttcc gcctatgcc 49 88 49 DNA ArtificialSequence Synthetic DNA 88 ccacggtgat gatgtcgtcc aaatactcct ggcataggcggaaggtctc 49 89 49 DNA Artificial Sequence Synthetic DNA 89 gacgacatcatcaccgtgga ctccgatgcc atttgtgccg ccatgaagg 49 90 50 DNA ArtificialSequence Synthetic DNA 90 cggcgactgc acggacatct tcgaataggt ccttcatggcggcacaaatg 50 91 48 DNA Artificial Sequence Synthetic DNA 91 gatgtccgtgcagtcgccga accgtctgga gctttagcat tagccggg 48 92 50 DNA ArtificialSequence Synthetic DNA 92 cggatgttgt gcagagcaat gtacttcttc atcccggctaatgctaaagc 50 93 49 DNA Artificial Sequence Synthetic DNA 93 cattgctctgcacaacatcc gaggcgaacg actggcccac atcttaagc 49 94 40 DNA ArtificialSequence Synthetic DNA 94 gcacaacatc cgaggcgaac gactggccca catcttaagc 4095 48 DNA Artificial Sequence Synthetic DNA 95 ccgtaagccg tggaagttgacattcgcacc gcttaagatg tgggccag 48 96 50 DNA Artificial SequenceSynthetic DNA 96 caacttccac ggcttacggt atgtgtctga gcgttgcgag ctgggcgaac50 97 50 DNA Artificial Sequence Synthetic DNA 97 cggaatggtc actgccagtaatgcttctct ttgttcgccc agctcgcaac 50 98 49 DNA Artificial SequenceSynthetic DNA 98 ctggcagtga ccattccgga agaaaaaggt tcgttcctca agttctgcc49 99 45 DNA Artificial Sequence Synthetic DNA 99 ccgtgacgct ccgacctcctaacagctggc agaacttgag gaacg 45 100 47 DNA Artificial Sequence SyntheticDNA 100 gaggtcggag cgtcacggaa tttaactatc ggtttgcaga cgccaag 47 101 50DNA Artificial Sequence Synthetic DNA 101 ctcaacctca cacccacaaaaatacaggca ttcttggcgt ctgcaaaccg 50 102 50 DNA Artificial SequenceSynthetic 
DNA 102 gtatttttgt gggtgtgagg ttgagcaggg gattggagga gcgcaaggag50 103 46 DNA Artificial Sequence Synthetic DNA 103 ccgccatcgttcagcatctg aagaatctcc ttgcgctcct ccaatc 46 104 47 DNA ArtificialSequence Synthetic DNA 104 gatgctgaac gatggcggtt atagcgtggt ggacctgagcgacgacg 47 105 26 DNA Artificial Sequence Synthetic DNA 105 gccatttcgtcgtcgctcag gtccac 26 106 37 DNA Artificial Sequence Synthetic DNA 106gttatagcgt ggtggacctg agcgacgacg aaatggc 37 107 50 DNA ArtificialSequence Synthetic DNA 107 gtccacccac catgtagcgt acgtgtagtt tagccatttcgtcgtcgctc 50 108 47 DNA Artificial Sequence Synthetic DNA 108gctacatggt gggtggacga ccttcacatc ccctccagga gcgactg 47 109 50 DNAArtificial Sequence Synthetic DNA 109 gcgccgggag actctgggaa ttcaaaggaatacagtcgct cctggagggg 50 110 49 DNA Artificial Sequence Synthetic DNA110 cccagagtct cccggcgcct tattacgttt cttaaacacc ctgggcacc 49 111 48 DNAArtificial Sequence Synthetic DNA 111 cggtagtgga acaggctgat attccaataggtgcccaggg tgtttaag 48 112 50 DNA Artificial Sequence Synthetic DNA 112cagcctgttc cactaccgat ctcatgggac ggattacggg cgtgttctgg 50 113 49 DNAArtificial Sequence Synthetic DNA 113 ccggttcatg atcgccaagc tcaaacgctgccagaacacg cccgtaatc 49 114 48 DNA Artificial Sequence Synthetic DNA 114cttggcgatc atgaaccgga ctttgaaacg cgcctgaacg aactgggc 48 115 50 DNAArtificial Sequence Synthetic DNA 115 gcggggttgt tggtctcatc atggcaatcatagcccagtt cgttcaggcg 50 116 45 DNA Artificial Sequence Synthetic DNA116 gatgagacca acaaccccgc ctttcgtttc ttcctcgcag gctaa 45 117 17 DNAArtificial Sequence Synthetic DNA 117 gtagcagtag gcatcac 17 118 18 DNAArtificial Sequence Synthetic DNA 118 ggcttcaacg gctatcac 18 119 18 DNAArtificial Sequence Synthetic DNA 119 gcttaagatg tgggccag 18 120 20 DNAArtificial Sequence Synthetic DNA 120 ttagcctgcg aggaagaaac 20 121 45DNA Artificial Sequence Synthetic DNA 121 ctatatctag catatggccgattctcaacc tctgtctgga gcacc 45 122 45 DNA Artificial Sequence SyntheticDNA 122 gtattggatc cttagcctgc gaggaagaaa cgaaaggcgg ggttg 45 123 1512DNA Artificial Sequence 
Synthetic DNA 123 atggatgtgc gttgcatcaattggtttgaa tcgcatggtg aaaacaggtt tttatatctg 60 aaaagccgct gtcgtaatggggaaactgtg ttcattcgct tccctcacta cttttactat 120 gtggtgaccg atgagatctaccagagctta gcccccccac ctttcaacgc tcgtcctatg 180 ggtaaaatgc ggaccattgacatcgatgag accatctcgt acaacctgga catcaaggat 240 cgtaaatgct ctgtggcggacatgtggtta attgaagagc cgaaaaagcg caacattcag 300 aatgccacca tggatgagtttctgaatatt tcttggttct acatcagcaa cggcatttct 360 ccggatggat gctacagcttggacgatcag tatctcacga aaatcaacaa cgggtgctat 420 cattgtggcg accctcgtaactgttttgcg aaagagatcc cccgttttga cattccgaga 480 agctatctgt tcctggacattgaatgccat ttcgataaga agttcccgag cgtttttatt 540 aatccgatca gccatacctcctattgttat attgatctga gcggcaaacg tctgctgttt 600 accctgatca acgaggagatgctgaccgaa caagaaatcc aggaggccgt ggatcgtggc 660 tgtctgcgca ttcagtccttgatggagatg gattatgaac gtgaactggt gctgtgctct 720 gaaattgtgc tgctccaaatcgccaaacag ttattagagc tgacctttga ttacatcgtg 780 acgttcaacg gccacaacttcgatctgcgg tatattacca atcgtctcga gctgttgacc 840 ggcgaaaaaa tcatctttcgtagccccgac aagaaagaag cggttcacct gtgcatctat 900 gagcgtaatc agtcgagccacaaaggggtt ggagggatgg cgaatacgac cttccacgtc 960 aataataata atggcaccatttttttcgac ctgtattctt tcatccagaa atcggagaag 1020 cttgattctt acaaactggacagcatcagc aaaaacgcct tttcgtgcat gggcaaagtg 1080 ctgaatcgtg gtgtgcgtgagatgaccttt atcggtgatg ataccactga tgcgaaaggg 1140 aaagcggctg tgtttgcgaaggtcctcacc acaggcaatt acgtgacggt cgatgatatc 1200 atttgtaaag tgattcacaaggacatctgg gaaaatggct ttaaggtggt gttgagctgt 1260 ccgactctga ccaacgacacgtacaaactc tcctttggta aagatgatgt cgacctggcg 1320 cagatgtata aagactataacctgaacatc gcccttgata tggcccgcta ttgcatccac 1380 gacgcctgtc tgtgccaatacctgtgggag tactatggtg tagagacgaa aacggatgcg 1440 ggtgcctcta cctatgtgttgcctcagtcc atggtgtttg agtataaagc gagcacggtg 1500 attaaggggc cc 1512 1241509 DNA Artificial Sequence Synthetic DNA 124 gggcccctgc tgaaattgctgctggaaacc aagaccatct tagtgcgctc tgaaaccaaa 60 caaaagttcc cctatgaaggcggtaaagtt tttgccccga agcagaagat gtttagtaac 120 aacgtcctga tctttgactacaactctctg tatcccaacg 
tgtgcatctt tggcaactta 180 agtccggaaa ccctggttggcgtggtggtg tcttcgaacc gcttggaaga agagattaac 240 aaccagctgc tcctgcaaaagtacccgccg ccacgttaca ttacggtgca ctgcgaacca 300 cgtttaccca acctgatcagcgagattgcc atttttgatc ggagcattga aggcaccatt 360 ccgcgtttac tgagaacctttctggccgag cgtgcgcgtt ataagaaaat gctgaaacag 420 gcgaccagtt ctacggaaaaagccatctac gacagtatgc agtacaccta caagatcatc 480 gcgaatagtg tgtatggcttgatgggtttt cgcaactctg ccttgtatag ctatgccagc 540 gctaagagtt gtaccagtattggccgtcgt atgatcctgt atctggaatc tgtactcaat 600 ggagcggaac tgagtaatggcatgcttcgt tttgcaaacc cgttaagtaa tccgttctac 660 atggatgatc gcgacattaacccgattgtg aagacgtccc tgccgattga ctaccgtttt 720 cgcttcagga gtgtctatggtgataccgac tccgtgttta ccgaaattga cagccaggat 780 gttgacaaaa gtattgagatagcgaaggag ctggaacgtc tgatcaactc tcgtgtgctg 840 ttcaacaact ttaagatcgagtttgaggcc gtgtataaaa acctgatcat gcagagcaag 900 aaaaaatata ccacgatgaagtatagcgcg agttctaact ccaaaagtgt gccggagcgt 960 attaacaagg ggactagcgaaacccgtcgt gatgtcagca agttccacaa aaacatgatt 1020 aaaatttaca agacccgtttgagcgaaatg ttaagtgaag gccggatgaa cagcaaccag 1080 gtgtgtatcg acattctgcgttcccttgaa acggatcttc gtagcgagtt cgacagccga 1140 tctagcccgt tggaactgttcatgttaagc cgcatgcacc acttgaacta taaaagcgcc 1200 gataacccga acatgtacctggtgaccgag tacaacaaaa acaacccgga aactattgaa 1260 cttggcgaac gctactactttgcctatatc tgtccggcga atgttccgtg gaccaaaaaa 1320 ctcgtgaaca tcaagacgtacgaaaccatt attgaccgtt ccttcaagct gggctcagat 1380 cagcgcattt tttacgaggtgtattttaaa cgtctgacct ccgaaatcgt gaacctgtta 1440 gataacaagg tgctgtgcatttcttttttt gaacgcatgt ttggcagcag accgaccttc 1500 tatgaggcg 1509 125 30DNA Artificial Sequence Synthetic DNA 125 tcctcgagca taatggatgtgcgttgcatc 30 126 48 DNA Artificial Sequence Synthetic DNA 126gacgacgacg acaagcatat gctcgaggat atggatgtgc gttgcatc 48 127 48 DNAArtificial Sequence Synthetic DNA 127 atggatgtgc gttgcatcaa ttggtttgaatcgcatggtg aaaacagg 48 128 50 DNA Artificial Sequence Synthetic DNA 128cgcatggtga aaacaggttt ttatatctga aaagccgctg tcgtaatggg 50 129 47 DNAArtificial Sequence 
Synthetic DNA 129 gccgctgtcg taatggggaa actgtgttcattcgcttccc tcactac 47 130 50 DNA Artificial Sequence Synthetic DNA 130cattcgcttc cctcactact tttactatgt ggtgaccgat gagatctacc 50 131 50 DNAArtificial Sequence Synthetic DNA 131 ggtgaccgat gagatctacc agagcttagcccccccacct ttcaacgctc 50 132 50 DNA Artificial Sequence Synthetic DNA132 cccacctttc aacgctcgtc ctatgggtaa aatgcggacc attgacatcg 50 133 50 DNAArtificial Sequence Synthetic DNA 133 gcggaccatt gacatcgatg agaccatctcgtacaacctg gacatcaagg 50 134 49 DNA Artificial Sequence Synthetic DNA134 cgtacaacct ggacatcaag gatcgtaaat gctctgtggc ggacatgtg 49 135 46 DNAArtificial Sequence Synthetic DNA 135 ctctgtggcg gacatgtggt taattgaagagccgaaaaag cgcaac 46 136 48 DNA Artificial Sequence Synthetic DNA 136gagccgaaaa agcgcaacat tcagaatgcc accatggatg agtttctg 48 137 50 DNAArtificial Sequence Synthetic DNA 137 ccaccatgga tgagtttctg aatatttcttggttctacat cagcaacggc 50 138 50 DNA Artificial Sequence Synthetic DNA138 ggttctacat cagcaacggc atttctccgg atggatgcta cagcttggac 50 139 50 DNAArtificial Sequence Synthetic DNA 139 ggttctacat cagcaacggc atttctccggatggatgcta cagcttggac 50 140 50 DNA Artificial Sequence Synthetic DNA140 ggatgctaca gcttggacga tcagtatctc acgaaaatca acaacgggtg 50 141 48 DNAArtificial Sequence Synthetic DNA 141 cacgaaaatc aacaacgggt gctatcattgtggcgaccct cgtaactg 48 142 45 DNA Artificial Sequence Synthetic DNA 142ggcgaccctc gtaactgttt tgcgaaagag atcccccgtt ttgac 45 143 49 DNAArtificial Sequence Synthetic DNA 143 gagatccccc gttttgacat tccgagaagctatctgttcc tggacattg 49 144 50 DNA Artificial Sequence Synthetic DNA 144gctatctgtt cctggacatt gaatgccatt tcgataagaa gttcccgagc 50 145 49 DNAArtificial Sequence Synthetic DNA 145 cgataagaag ttcccgagcg tttttattaatccgatcagc catacctcc 49 146 50 DNA Artificial Sequence Synthetic DNA 146cgatcagcca tacctcctat tgttatattg atctgagcgg caaacgtctg 50 147 50 DNAArtificial Sequence Synthetic DNA 147 ctgagcggca aacgtctgct gtttaccctgatcaacgagg agatgctgac 50 148 50 DNA Artificial Sequence Synthetic 
DNA148 gatcaacgag gagatgctga ccgaacaaga aatccaggag gccgtggatc 50 149 49 DNAArtificial Sequence Synthetic DNA 149 ccaggaggcc gtggatcgtg gctgtctgcgcattcagtcc ttgatggag 49 150 50 DNA Artificial Sequence Synthetic DNA 150gcattcagtc cttgatggag atggattatg aacgtgaact ggtgctgtgc 50 151 50 DNAArtificial Sequence Synthetic DNA 151 gcattcagtc cttgatggag atggattatgaacgtgaact ggtgctgtgc 50 152 48 DNA Artificial Sequence Synthetic DNA152 gtgaactggt gctgtgctct gaaattgtgc tgctccaaat cgccaaac 48 153 49 DNAArtificial Sequence Synthetic DNA 153 gctccaaatc gccaaacagt tattagagctgacctttgat tacatcgtg 49 154 50 DNA Artificial Sequence Synthetic DNA 154gctgaccttt gattacatcg tgacgttcaa cggccacaac ttcgatctgc 50 155 50 DNAArtificial Sequence Synthetic DNA 155 gccacaactt cgatctgcgg tatattaccaatcgtctcga gctgttgacc 50 156 50 DNA Artificial Sequence Synthetic DNA156 gtctcgagct gttgaccggc gaaaaaatca tctttcgtag ccccgacaag 50 157 50 DNAArtificial Sequence Synthetic DNA 157 ctttcgtagc cccgacaaga aagaagcggttcacctgtgc atctatgagc 50 158 46 DNA Artificial Sequence Synthetic DNA158 cacctgtgca tctatgagcg taatcagtcg agccacaaag gggttg 46 159 46 DNAArtificial Sequence Synthetic DNA 159 gagccacaaa ggggttggag ggatggcgaatacgaccttc cacgtc 46 160 50 DNA Artificial Sequence Synthetic DNA 160gaatacgacc ttccacgtca ataataataa tggcaccatt tttttcgacc 50 161 50 DNAArtificial Sequence Synthetic DNA 161 gaatacgacc ttccacgtca ataataataatggcaccatt tttttcgacc 50 162 49 DNA Artificial Sequence Synthetic DNA162 ggcaccattt ttttcgacct gtattctttc atccagaaat cggagaagc 49 163 49 DNAArtificial Sequence Synthetic DNA 163 catccagaaa tcggagaagc ttgattcttacaaactggac agcatcagc 49 164 50 DNA Artificial Sequence Synthetic DNA 164caaactggac agcatcagca aaaacgcctt ttcgtgcatg ggcaaagtgc 50 165 49 DNAArtificial Sequence Synthetic DNA 165 gtgcatgggc aaagtgctga atcgtggtgtgcgtgagatg acctttatc 49 166 48 DNA Artificial Sequence Synthetic DNA 166gtgcgtgaga tgacctttat cggtgatgat accactgatg cgaaaggg 48 167 47 DNAArtificial Sequence Synthetic DNA 167 
ccactgatgc gaaagggaaa gcggctgtgtttgcgaaggt cctcacc 47 168 49 DNA Artificial Sequence Synthetic DNA 168gtttgcgaag gtcctcacca caggcaatta cgtgacggtc gatgatatc 49 169 50 DNAArtificial Sequence Synthetic DNA 169 cgtgacggtc gatgatatca tttgtaaagtgattcacaag gacatctggg 50 170 50 DNA Artificial Sequence Synthetic DNA170 gattcacaag gacatctggg aaaatggctt taaggtggtg ttgagctgtc 50 171 50 DNAArtificial Sequence Synthetic DNA 171 gattcacaag gacatctggg aaaatggctttaaggtggtg ttgagctgtc 50 172 49 DNA Artificial Sequence Synthetic DNA172 ggtggtgttg agctgtccga ctctgaccaa cgacacgtac aaactctcc 49 173 49 DNAArtificial Sequence Synthetic DNA 173 cgacacgtac aaactctcct ttggtaaagatgatgtcgac ctggcgcag 49 174 49 DNA Artificial Sequence Synthetic DNA 174gatgtcgacc tggcgcagat gtataaagac tataacctga acatcgccc 49 175 47 DNAArtificial Sequence Synthetic DNA 175 ctataacctg aacatcgccc ttgatatggcccgctattgc atccacg 47 176 49 DNA Artificial Sequence Synthetic DNA 176ccgctattgc atccacgacg cctgtctgtg ccaatacctg tgggagtac 49 177 50 DNAArtificial Sequence Synthetic DNA 177 gtgccaatac ctgtgggagt actatggtgtagagacgaaa acggatgcgg 50 178 46 DNA Artificial Sequence Synthetic DNA178 gacgaaaacg gatgcgggtg cctctaccta tgtgttgcct cagtcc 46 179 49 DNAArtificial Sequence Synthetic DNA 179 ctatgtgttg cctcagtcca tggtgtttgagtataaagcg agcacggtg 49 180 31 DNA Artificial Sequence Synthetic DNA 180gtataaagcg agcacggtga ttaaggggcc c 31 181 29 DNA Artificial SequenceSynthetic DNA 181 cggtgattaa ggggccctat ggctgccgc 29 182 47 DNAArtificial Sequence Synthetic DNA 182 cggtgattaa ggggccctac ccatacgatgttccggatta cgcttaa 47 183 30 DNA Artificial Sequence Synthetic DNA 183tcctcgagca tagggcccct gctgaaattg 30 184 48 DNA Artificial SequenceSynthetic DNA 184 gacgacgacg acaagcatat gctcgaggat gggcccctgc tgaaattg48 185 50 DNA Artificial Sequence Synthetic DNA 185 gggcccctgctgaaattgct gctggaaacc aagaccatct tagtgcgctc 50 186 50 DNA ArtificialSequence Synthetic DNA 186 gaccatctta gtgcgctctg aaaccaaaca aaagttcccctatgaaggcg 50 187 46 DNA Artificial 
Sequence Synthetic DNA 187gttcccctat gaaggcggta aagtttttgc cccgaagcag aagatg 46 188 50 DNAArtificial Sequence Synthetic DNA 188 ccccgaagca gaagatgttt agtaacaacgtcctgatctt tgactacaac 50 189 50 DNA Artificial Sequence Synthetic DNA189 cgtcctgatc tttgactaca actctctgta tcccaacgtg tgcatctttg 50 190 50 DNAArtificial Sequence Synthetic DNA 190 ccaacgtgtg catctttggc aacttaagtccggaaaccct ggttggcgtg 50 191 48 DNA Artificial Sequence Synthetic DNA191 gaaaccctgg ttggcgtggt ggtgtcttcg aaccgcttgg aagaagag 48 192 50 DNAArtificial Sequence Synthetic DNA 192 gaaccgcttg gaagaagaga ttaacaaccagctgctcctg caaaagtacc 50 193 49 DNA Artificial Sequence Synthetic DNA193 ctgctcctgc aaaagtaccc gccgccacgt tacattacgg tgcactgcg 49 194 50 DNAArtificial Sequence Synthetic DNA 194 cattacggtg cactgcgaac cacgtttacccaacctgatc agcgagattg 50 195 49 DNA Artificial Sequence Synthetic DNA195 caacctgatc agcgagattg ccatttttga tcggagcatt gaaggcacc 49 196 50 DNAArtificial Sequence Synthetic DNA 196 ggagcattga aggcaccatt ccgcgtttactgagaacctt tctggccgag 50 197 50 DNA Artificial Sequence Synthetic DNA197 ggagcattga aggcaccatt ccgcgtttac tgagaacctt tctggccgag 50 198 50 DNAArtificial Sequence Synthetic DNA 198 gaacctttct ggccgagcgt gcgcgttataagaaaatgct gaaacaggcg 50 199 49 DNA Artificial Sequence Synthetic DNA199 gaaaatgctg aaacaggcga ccagttctac ggaaaaagcc atctacgac 49 200 50 DNAArtificial Sequence Synthetic DNA 200 cggaaaaagc catctacgac agtatgcagtacacctacaa gatcatcgcg 50 201 49 DNA Artificial Sequence Synthetic DNA201 cacctacaag atcatcgcga atagtgtgta tggcttgatg ggttttcgc 49 202 50 DNAArtificial Sequence Synthetic DNA 202 gcttgatggg ttttcgcaac tctgccttgtatagctatgc cagcgctaag 50 203 50 DNA Artificial Sequence Synthetic DNA203 gctatgccag cgctaagagt tgtaccagta ttggccgtcg tatgatcctg 50 204 50 DNAArtificial Sequence Synthetic DNA 204 gccgtcgtat gatcctgtat ctggaatctgtactcaatgg agcggaactg 50 205 48 DNA Artificial Sequence Synthetic DNA205 ctcaatggag cggaactgag taatggcatg cttcgttttg caaacccg 48 206 48 DNAArtificial 
Sequence Synthetic DNA 206 cttcgttttg caaacccgtt aagtaatccgttctacatgg atgatcgc 48 207 48 DNA Artificial Sequence Synthetic DNA 207cgttctacat ggatgatcgc gacattaacc cgattgtgaa gacgtccc 48 208 47 DNAArtificial Sequence Synthetic DNA 208 cgattgtgaa gacgtccctg ccgattgactaccgttttcg cttcagg 47 209 47 DNA Artificial Sequence Synthetic DNA 209cgattgtgaa gacgtccctg ccgattgact accgttttcg cttcagg 47 210 46 DNAArtificial Sequence Synthetic DNA 210 ctaccgtttt cgcttcagga gtgtctatggtgataccgac tccgtg 46 211 47 DNA Artificial Sequence Synthetic DNA 211gtgataccga ctccgtgttt accgaaattg acagccagga tgttgac 47 212 46 DNAArtificial Sequence Synthetic DNA 212 gacagccagg atgttgacaa aagtattgagatagcgaagg agctgg 46 213 46 DNA Artificial Sequence Synthetic DNA 213gatagcgaag gagctggaac gtctgatcaa ctctcgtgtg ctgttc 46 214 47 DNAArtificial Sequence Synthetic DNA 214 caactctcgt gtgctgttca acaactttaagatcgagttt gaggccg 47 215 46 DNA Artificial Sequence Synthetic DNA 215gatcgagttt gaggccgtgt ataaaaacct gatcatgcag agcaag 46 216 49 DNAArtificial Sequence Synthetic DNA 216 cctgatcatg cagagcaaga aaaaatataccacgatgaag tatagcgcg 49 217 47 DNA Artificial Sequence Synthetic DNA 217cacgatgaag tatagcgcga gttctaactc caaaagtgtg ccggagc 47 218 44 DNAArtificial Sequence Synthetic DNA 218 caaaagtgtg ccggagcgta ttaacaaggggactagcgaa accc 44 219 44 DNA Artificial Sequence Synthetic DNA 219caaaagtgtg ccggagcgta ttaacaaggg gactagcgaa accc 44 220 49 DNAArtificial Sequence Synthetic DNA 220 ggggactagc gaaacccgtc gtgatgtcagcaagttccac aaaaacatg 49 221 50 DNA Artificial Sequence Synthetic DNA 221cagcaagttc cacaaaaaca tgattaaaat ttacaagacc cgtttgagcg 50 222 50 DNAArtificial Sequence Synthetic DNA 222 caagacccgt ttgagcgaaa tgttaagtgaaggccggatg aacagcaacc 50 223 44 DNA Artificial Sequence Synthetic DNA223 ccggatgaac agcaaccagg tgtgtatcga cattctgcgt tccc 44 224 46 DNAArtificial Sequence Synthetic DNA 224 cgacattctg cgttcccttg aaacggatcttcgtagcgag ttcgac 46 225 45 DNA Artificial Sequence Synthetic DNA 225cttcgtagcg agttcgacag 
ccgatctagc ccgttggaac tgttc 45 226 44 DNAArtificial Sequence Synthetic DNA 226 gcccgttgga actgttcatg ttaagccgcatgcaccactt gaac 44 227 48 DNA Artificial Sequence Synthetic DNA 227gcatgcacca cttgaactat aaaagcgccg ataacccgaa catgtacc 48 228 50 DNAArtificial Sequence Synthetic DNA 228 cgataacccg aacatgtacc tggtgaccgagtacaacaaa aacaacccgg 50 229 50 DNA Artificial Sequence Synthetic DNA229 cgataacccg aacatgtacc tggtgaccga gtacaacaaa aacaacccgg 50 230 49 DNAArtificial Sequence Synthetic DNA 230 gtacaacaaa aacaacccgg aaactattgaacttggcgaa cgctactac 49 231 50 DNA Artificial Sequence Synthetic DNA 231cttggcgaac gctactactt tgcctatatc tgtccggcga atgttccgtg 50 232 49 DNAArtificial Sequence Synthetic DNA 232 ccggcgaatg ttccgtggac caaaaaactcgtgaacatca agacgtacg 49 233 50 DNA Artificial Sequence Synthetic DNA 233cgtgaacatc aagacgtacg aaaccattat tgaccgttcc ttcaagctgg 50 234 46 DNAArtificial Sequence Synthetic DNA 234 ccgttccttc aagctgggct cagatcagcgcattttttac gaggtg 46 235 49 DNA Artificial Sequence Synthetic DNA 235gcgcattttt tacgaggtgt attttaaacg tctgacctcc gaaatcgtg 49 236 50 DNAArtificial Sequence Synthetic DNA 236 ctgacctccg aaatcgtgaa cctgttagataacaaggtgc tgtgcatttc 50 237 49 DNA Artificial Sequence Synthetic DNA237 caaggtgctg tgcatttctt tttttgaacg catgtttggc agcagaccg 49 238 34 DNAArtificial Sequence Synthetic DNA 238 catgtttggc agcagaccga ccttctatgaggcg 34 239 29 DNA Artificial Sequence Synthetic DNA 239 cgaccttctatgaggcgtat ggctgccgc 29 240 47 DNA Artificial Sequence Synthetic DNA 240cgaccttcta tgaggcgtac ccatacgatg ttccggatta cgcttaa 47 241 48 DNAArtificial Sequence Synthetic DNA 241 atggatgtgc gttgcatcaa ttggtttgaatcgcatggtg aaaacagg 48 242 50 DNA Artificial Sequence Synthetic DNA 242cccattacga cagcggcttt tcagatataa aaacctgttt tcaccatgcg 50 243 47 DNAArtificial Sequence Synthetic DNA 243 gccgctgtcg taatggggaa actgtgttcattcgcttccc tcactac 47 244 50 DNA Artificial Sequence Synthetic DNA 244ggtagatctc atcggtcacc acatagtaaa agtagtgagg gaagcgaatg 50 245 50 DNAArtificial 
Sequence Synthetic DNA 245 ggtgaccgat gagatctacc agagcttagcccccccacct ttcaacgctc 50 246 50 DNA Artificial Sequence Synthetic DNA246 cgatgtcaat ggtccgcatt ttacccatag gacgagcgtt gaaaggtggg 50 247 50 DNAArtificial Sequence Synthetic DNA 247 gcggaccatt gacatcgatg agaccatctcgtacaacctg gacatcaagg 50 248 49 DNA Artificial Sequence Synthetic DNA248 cacatgtccg ccacagagca tttacgatcc ttgatgtcca ggttgtacg 49 249 46 DNAArtificial Sequence Synthetic DNA 249 ctctgtggcg gacatgtggt taattgaagagccgaaaaag cgcaac 46 250 48 DNA Artificial Sequence Synthetic DNA 250cagaaactca tccatggtgg cattctgaat gttgcgcttt ttcggctc 48 251 50 DNAArtificial Sequence Synthetic DNA 251 ccaccatgga tgagtttctg aatatttcttggttctacat cagcaacggc 50 252 50 DNA Artificial Sequence Synthetic DNA252 gtccaagctg tagcatccat ccggagaaat gccgttgctg atgtagaacc 50 253 50 DNAArtificial Sequence Synthetic DNA 253 ggttctacat cagcaacggc atttctccggatggatgcta cagcttggac 50 254 50 DNA Artificial Sequence Synthetic DNA254 cacccgttgt tgattttcgt gagatactga tcgtccaagc tgtagcatcc 50 255 48 DNAArtificial Sequence Synthetic DNA 255 cacgaaaatc aacaacgggt gctatcattgtggcgaccct cgtaactg 48 256 45 DNA Artificial Sequence Synthetic DNA 256gtcaaaacgg gggatctctt tcgcaaaaca gttacgaggg tcgcc 45 257 49 DNAArtificial Sequence Synthetic DNA 257 gagatccccc gttttgacat tccgagaagctatctgttcc tggacattg 49 258 50 DNA Artificial Sequence Synthetic DNA 258gctcgggaac ttcttatcga aatggcattc aatgtccagg aacagatagc 50 259 49 DNAArtificial Sequence Synthetic DNA 259 cgataagaag ttcccgagcg tttttattaatccgatcagc catacctcc 49 260 50 DNA Artificial Sequence Synthetic DNA 260cagacgtttg ccgctcagat caatataaca ataggaggta tggctgatcg 50 261 50 DNAArtificial Sequence Synthetic DNA 261 ctgagcggca aacgtctgct gtttaccctgatcaacgagg agatgctgac 50 262 50 DNA Artificial Sequence Synthetic DNA262 gatccacggc ctcctggatt tcttgttcgg tcagcatctc ctcgttgatc 50 263 49 DNAArtificial Sequence Synthetic DNA 263 ccaggaggcc gtggatcgtg gctgtctgcgcattcagtcc ttgatggag 49 264 50 DNA Artificial Sequence 
Synthetic DNA 264gcacagcacc agttcacgtt cataatccat ctccatcaag gactgaatgc 50 265 50 DNAArtificial Sequence Synthetic DNA 265 gcattcagtc cttgatggag atggattatgaacgtgaact ggtgctgtgc 50 266 48 DNA Artificial Sequence Synthetic DNA266 gtttggcgat ttggagcagc acaatttcag agcacagcac cagttcac 48 267 49 DNAArtificial Sequence Synthetic DNA 267 gctccaaatc gccaaacagt tattagagctgacctttgat tacatcgtg 49 268 50 DNA Artificial Sequence Synthetic DNA 268gcagatcgaa gttgtggccg ttgaacgtca cgatgtaatc aaaggtcagc 50 269 50 DNAArtificial Sequence Synthetic DNA 269 gccacaactt cgatctgcgg tatattaccaatcgtctcga gctgttgacc 50 270 50 DNA Artificial Sequence Synthetic DNA270 cttgtcgggg ctacgaaaga tgattttttc gccggtcaac agctcgagac 50 271 50 DNAArtificial Sequence Synthetic DNA 271 ctttcgtagc cccgacaaga aagaagcggttcacctgtgc atctatgagc 50 272 46 DNA Artificial Sequence Synthetic DNA272 caaccccttt gtggctcgac tgattacgct catagatgca caggtg 46 273 46 DNAArtificial Sequence Synthetic DNA 273 gagccacaaa ggggttggag ggatggcgaatacgaccttc cacgtc 46 274 50 DNA Artificial Sequence Synthetic DNA 274ggtcgaaaaa aatggtgcca ttattattat tgacgtggaa ggtcgtattc 50 275 50 DNAArtificial Sequence Synthetic DNA 275 gaatacgacc ttccacgtca ataataataatggcaccatt tttttcgacc 50 276 49 DNA Artificial Sequence Synthetic DNA276 gcttctccga tttctggatg aaagaataca ggtcgaaaaa aatggtgcc 49 277 49 DNAArtificial Sequence Synthetic DNA 277 catccagaaa tcggagaagc ttgattcttacaaactggac agcatcagc 49 278 50 DNA Artificial Sequence Synthetic DNA 278gcactttgcc catgcacgaa aaggcgtttt tgctgatgct gtccagtttg 50 279 49 DNAArtificial Sequence Synthetic DNA 279 gtgcatgggc aaagtgctga atcgtggtgtgcgtgagatg acctttatc 49 280 48 DNA Artificial Sequence Synthetic DNA 280ccctttcgca tcagtggtat catcaccgat aaaggtcatc tcacgcac 48 281 47 DNAArtificial Sequence Synthetic DNA 281 ccactgatgc gaaagggaaa gcggctgtgtttgcgaaggt cctcacc 47 282 49 DNA Artificial Sequence Synthetic DNA 282gatatcatcg accgtcacgt aattgcctgt ggtgaggacc ttcgcaaac 49 283 50 DNAArtificial Sequence Synthetic DNA 
283 cgtgacggtc gatgatatca tttgtaaagtgattcacaag gacatctggg 50 284 50 DNA Artificial Sequence Synthetic DNA284 gacagctcaa caccacctta aagccatttt cccagatgtc cttgtgaatc 50 285 50 DNAArtificial Sequence Synthetic DNA 285 gattcacaag gacatctggg aaaatggctttaaggtggtg ttgagctgtc 50 286 49 DNA Artificial Sequence Synthetic DNA286 ggagagtttg tacgtgtcgt tggtcagagt cggacagctc aacaccacc 49 287 49 DNAArtificial Sequence Synthetic DNA 287 cgacacgtac aaactctcct ttggtaaagatgatgtcgac ctggcgcag 49 288 49 DNA Artificial Sequence Synthetic DNA 288gggcgatgtt caggttatag tctttataca tctgcgccag gtcgacatc 49 289 47 DNAArtificial Sequence Synthetic DNA 289 ctataacctg aacatcgccc ttgatatggcccgctattgc atccacg 47 290 49 DNA Artificial Sequence Synthetic DNA 290gtactcccac aggtattggc acagacaggc gtcgtggatg caatagcgg 49 291 50 DNAArtificial Sequence Synthetic DNA 291 gtgccaatac ctgtgggagt actatggtgtagagacgaaa acggatgcgg 50 292 46 DNA Artificial Sequence Synthetic DNA292 ggactgaggc aacacatagg tagaggcacc cgcatccgtt ttcgtc 46 293 49 DNAArtificial Sequence Synthetic DNA 293 ctatgtgttg cctcagtcca tggtgtttgagtataaagcg agcacggtg 49 294 31 DNA Artificial Sequence Synthetic DNA 294gggcccctta atcaccgtgc tcgctttata c 31 295 50 DNA Artificial SequenceSynthetic DNA 295 gggcccctgc tgaaattgct gctggaaacc aagaccatct tagtgcgctc50 296 50 DNA Artificial Sequence Synthetic DNA 296 cgccttcataggggaacttt tgtttggttt cagagcgcac taagatggtc 50 297 46 DNA ArtificialSequence Synthetic DNA 297 gttcccctat gaaggcggta aagtttttgc cccgaagcagaagatg 46 298 50 DNA Artificial Sequence Synthetic DNA 298 gttgtagtcaaagatcagga cgttgttact aaacatcttc tgcttcgggg 50 299 50 DNA ArtificialSequence Synthetic DNA 299 cgtcctgatc tttgactaca actctctgta tcccaacgtgtgcatctttg 50 300 50 DNA Artificial Sequence Synthetic DNA 300cacgccaacc agggtttccg gacttaagtt gccaaagatg cacacgttgg 50 301 48 DNAArtificial Sequence Synthetic DNA 301 gaaaccctgg ttggcgtggt ggtgtcttcgaaccgcttgg aagaagag 48 302 50 DNA Artificial Sequence Synthetic DNA 302ggtacttttg caggagcagc 
tggttgttaa tctcttcttc caagcggttc 50 303 49 DNAArtificial Sequence Synthetic DNA 303 ctgctcctgc aaaagtaccc gccgccacgttacattacgg tgcactgcg 49 304 50 DNA Artificial Sequence Synthetic DNA 304caatctcgct gatcaggttg ggtaaacgtg gttcgcagtg caccgtaatg 50 305 49 DNAArtificial Sequence Synthetic DNA 305 caacctgatc agcgagattg ccatttttgatcggagcatt gaaggcacc 49 306 50 DNA Artificial Sequence Synthetic DNA 306ctcggccaga aaggttctca gtaaacgcgg aatggtgcct tcaatgctcc 50 307 50 DNAArtificial Sequence Synthetic DNA 307 ggagcattga aggcaccatt ccgcgtttactgagaacctt tctggccgag 50 308 50 DNA Artificial Sequence Synthetic DNA308 cgcctgtttc agcattttct tataacgcgc acgctcggcc agaaaggttc 50 309 49 DNAArtificial Sequence Synthetic DNA 309 gaaaatgctg aaacaggcga ccagttctacggaaaaagcc atctacgac 49 310 50 DNA Artificial Sequence Synthetic DNA 310cgcgatgatc ttgtaggtgt actgcatact gtcgtagatg gctttttccg 50 311 49 DNAArtificial Sequence Synthetic DNA 311 cacctacaag atcatcgcga atagtgtgtatggcttgatg ggttttcgc 49 312 50 DNA Artificial Sequence Synthetic DNA 312cttagcgctg gcatagctat acaaggcaga gttgcgaaaa cccatcaagc 50 313 50 DNAArtificial Sequence Synthetic DNA 313 gctatgccag cgctaagagt tgtaccagtattggccgtcg tatgatcctg 50 314 50 DNA Artificial Sequence Synthetic DNA314 cagttccgct ccattgagta cagattccag atacaggatc atacgacggc 50 315 48 DNAArtificial Sequence Synthetic DNA 315 ctcaatggag cggaactgag taatggcatgcttcgttttg caaacccg 48 316 48 DNA Artificial Sequence Synthetic DNA 316gcgatcatcc atgtagaacg gattacttaa cgggtttgca aaacgaag 48 317 48 DNAArtificial Sequence Synthetic DNA 317 cgttctacat ggatgatcgc gacattaacccgattgtgaa gacgtccc 48 318 47 DNA Artificial Sequence Synthetic DNA 318cctgaagcga aaacggtagt caatcggcag ggacgtcttc acaatcg 47 319 47 DNAArtificial Sequence Synthetic DNA 319 cgattgtgaa gacgtccctg ccgattgactaccgttttcg cttcagg 47 320 46 DNA Artificial Sequence Synthetic DNA 320cacggagtcg gtatcaccat agacactcct gaagcgaaaa cggtag 46 321 47 DNAArtificial Sequence Synthetic DNA 321 gtgataccga ctccgtgttt 
accgaaattgacagccagga tgttgac 47 322 46 DNA Artificial Sequence Synthetic DNA 322ccagctcctt cgctatctca atacttttgt caacatcctg gctgtc 46 323 46 DNAArtificial Sequence Synthetic DNA 323 gatagcgaag gagctggaac gtctgatcaactctcgtgtg ctgttc 46 324 47 DNA Artificial Sequence Synthetic DNA 324cggcctcaaa ctcgatctta aagttgttga acagcacacg agagttg 47 325 46 DNAArtificial Sequence Synthetic DNA 325 gatcgagttt gaggccgtgt ataaaaacctgatcatgcag agcaag 46 326 49 DNA Artificial Sequence Synthetic DNA 326cgcgctatac ttcatcgtgg tatatttttt cttgctctgc atgatcagg 49 327 47 DNAArtificial Sequence Synthetic DNA 327 cacgatgaag tatagcgcga gttctaactccaaaagtgtg ccggagc 47 328 44 DNA Artificial Sequence Synthetic DNA 328gggtttcgct agtccccttg ttaatacgct ccggcacact tttg 44 329 44 DNAArtificial Sequence Synthetic DNA 329 caaaagtgtg ccggagcgta ttaacaaggggactagcgaa accc 44 330 49 DNA Artificial Sequence Synthetic DNA 330catgtttttg tggaacttgc tgacatcacg acgggtttcg ctagtcccc 49 331 50 DNAArtificial Sequence Synthetic DNA 331 cagcaagttc cacaaaaaca tgattaaaatttacaagacc cgtttgagcg 50 332 50 DNA Artificial Sequence Synthetic DNA332 ggttgctgtt catccggcct tcacttaaca tttcgctcaa acgggtcttg 50 333 44 DNAArtificial Sequence Synthetic DNA 333 ccggatgaac agcaaccagg tgtgtatcgacattctgcgt tccc 44 334 46 DNA Artificial Sequence Synthetic DNA 334gtcgaactcg ctacgaagat ccgtttcaag ggaacgcaga atgtcg 46 335 45 DNAArtificial Sequence Synthetic DNA 335 cttcgtagcg agttcgacag ccgatctagcccgttggaac tgttc 45 336 44 DNA Artificial Sequence Synthetic DNA 336gttcaagtgg tgcatgcggc ttaacatgaa cagttccaac gggc 44 337 48 DNAArtificial Sequence Synthetic DNA 337 gcatgcacca cttgaactat aaaagcgccgataacccgaa catgtacc 48 338 50 DNA Artificial Sequence Synthetic DNA 338ccgggttgtt tttgttgtac tcggtcacca ggtacatgtt cgggttatcg 50 339 50 DNAArtificial Sequence Synthetic DNA 339 cgataacccg aacatgtacc tggtgaccgagtacaacaaa aacaacccgg 50 340 49 DNA Artificial Sequence Synthetic DNA340 gtagtagcgt tcgccaagtt caatagtttc cgggttgttt ttgttgtac 49 341 50 
DNAArtificial Sequence Synthetic DNA 341 cttggcgaac gctactactt tgcctatatctgtccggcga atgttccgtg 50 342 49 DNA Artificial Sequence Synthetic DNA342 cgtacgtctt gatgttcacg agttttttgg tccacggaac attcgccgg 49 343 50 DNAArtificial Sequence Synthetic DNA 343 cgtgaacatc aagacgtacg aaaccattattgaccgttcc ttcaagctgg 50 344 46 DNA Artificial Sequence Synthetic DNA344 cacctcgtaa aaaatgcgct gatctgagcc cagcttgaag gaacgg 46 345 49 DNAArtificial Sequence Synthetic DNA 345 gcgcattttt tacgaggtgt attttaaacgtctgacctcc gaaatcgtg 49 346 50 DNA Artificial Sequence Synthetic DNA 346gaaatgcaca gcaccttgtt atctaacagg ttcacgattt cggaggtcag 50 347 49 DNAArtificial Sequence Synthetic DNA 347 caaggtgctg tgcatttctt tttttgaacgcatgtttggc agcagaccg 49 348 34 DNA Artificial Sequence Synthetic DNA 348cgcctcatag aaggtcggtc tgctgccaaa catg 34 349 29 DNA Artificial SequenceSynthetic DNA 349 gcggcagcca tagggcccct taatcaccg 29 350 47 DNAArtificial Sequence Synthetic DNA 350 ttaagcgtaa tccggaacat cgtatgggtagggcccctta atcaccg 47 351 29 DNA Artificial Sequence Synthetic DNA 351gcggcagcca tacgcctcat agaaggtcg 29 352 47 DNA Artificial SequenceSynthetic DNA 352 ttaagcgtaa tccggaacat cgtatgggta cgcctcatag aaggtcg 47353 1542 DNA Artificial Sequence Synthetic DNA 353 atggcggatt ctcagccgttgtctggtgcc ccggaaggcg cggagtatct tcgggcggta 60 ctgcgagctc cagtgtatgaagccgcacag gtgaccccgc tgcaaaaaat ggagaaactg 120 agctcccgtc tggataacgtcatcctggtc aaacgcgaag atcgtcagcc ggttcacagc 180 ttcaaactgc gcggggcctatgctatgatg gcgggcttaa cggaagagca gaaagcacat 240 ggcgttatta ccgcgtctgcaggcaaccat gcacaggggg tagcgttttc tagtgcgcgt 300 ttaggcgtga aagcgttgatcgtgatgccg accgcaaccg ctgatatcaa agtcgatgcc 360 gtgcgtggct ttggcggagaagtgctgtta cacggcgcaa acttcgatga agcaaaagcc 420 aaagcgatcg aactgtctcagcagcagggg tttacctggg ttccgccgtt cgatcatccg 480 atggtgattg ctggtcagggcactctggcg ttagaactgc tccagcaaga tgcccacctg 540 gatcgtgtgt ttgtgccggtaggtggagga ggccttgctg caggagtcgc agtgctgatc 600 aaacagttga tgccgcagatcaaggttatt gccgtggaag ccgaggatag cgcctgtctg 660 
aaagcggcgt tagatgctggtcatccggtc gatctgccac gtgtaggcct gtttgcggaa 720 ggtgtagcgg tgaagcgtattggcgacgaa acctttcggc tgtgccagga atatctcgac 780 gacatcatca ccgttgacagcgatgcgatt tgtgccgcga tgaaagacct gttcgaagat 840 gtgcgtgcgg tagctgaaccaagtggtgca ttagcgttgg ccggcatgaa aaagtacatt 900 gccttgcaca acattcgtggcgaacgcctg gcccatatcc tgagtggagc caacgtcaac 960 ttccatggcc tgcgttacgttagcgaacgg tgtgaactgg gcgagcaacg tgaagcgctc 1020 ctggccgtta ccatcccagaggagaagggc agctttctga aattttgcca gctgttaggg 1080 ggccgtagcg tcaccgaattcaactaccgc tttgccgatg cgaaaaatgc gtgcattttc 1140 gtgggcgtcc gtctgtcccgtggcctggaa gagcgcaagg agattctgca gatgctgaac 1200 gatggtggtt attccgtggtggatctcagc gacgatgaaa tggccaagct gcatgtgcgc 1260 tatatggtgg ggggtcgtccgagtcacccg ttgcaggaac gcttgtacag cttcgagttt 1320 ccggagtctc ctggtgcactgttacgcttc ctgaacaccc tggggacgta ctggaacatc 1380 agcctgtttc actatcgctcccatggtact gactacggtc gggtcctggc tgcgtttgaa 1440 ctgggggacc acgaaccggacttcgaaacc cgcctgaacg aattaggcta cgactgccat 1500 gacgaaacca acaacccggcgtttcgcttt tttctggcag gg 1542 354 33 DNA Artificial Sequence SyntheticDNA 354 atggcggatt ctcagccgtt gtctggtgcc ccg 33 355 60 DNA ArtificialSequence Synthetic DNA 355 atggcggatt ctcagccgtt gtctggtgcc ccggaaggcgcggagtatct tcgggcggta 60 356 56 DNA Artificial Sequence Synthetic DNA356 gaaggcgcgg agtatcttcg ggcggtactg cgagctccag tgtatgaagc cgcaca 56 35760 DNA Artificial Sequence Synthetic DNA 357 ctgcgagctc cagtgtatgaagccgcacag gtgaccccgc tgcaaaaaat ggagaaactg 60 358 60 DNA ArtificialSequence Synthetic DNA 358 ggtgaccccg ctgcaaaaaa tggagaaact gagctcccgtctggataacg tcatcctggt 60 359 60 DNA Artificial Sequence Synthetic DNA359 agctcccgtc tggataacgt catcctggtc aaacgcgaag atcgtcagcc ggttcacagc 60360 60 DNA Artificial Sequence Synthetic DNA 360 caaacgcgaa gatcgtcagccggttcacag cttcaaactg cgcggggcct atgctatgat 60 361 60 DNA ArtificialSequence Synthetic DNA 361 ttcaaactgc gcggggccta tgctatgatg gcgggcttaacggaagagca gaaagcacat 60 362 60 DNA Artificial Sequence Synthetic DNA362 ggcgggctta acggaagagc 
agaaagcaca tggcgttatt accgcgtctg caggcaacca 60363 60 DNA Artificial Sequence Synthetic DNA 363 ggcgttatta ccgcgtctgcaggcaaccat gcacaggggg tagcgttttc tagtgcgcgt 60 364 60 DNA ArtificialSequence Synthetic DNA 364 tgcacagggg gtagcgtttt ctagtgcgcg tttaggcgtgaaagcgttga tcgtgatgcc 60 365 60 DNA Artificial Sequence Synthetic DNA365 ttaggcgtga aagcgttgat cgtgatgccg accgcaaccg ctgatatcaa agtcgatgcc 60366 60 DNA Artificial Sequence Synthetic DNA 366 gaccgcaacc gctgatatcaaagtcgatgc cgtgcgtggc tttggcggag aagtgctgtt 60 367 60 DNA ArtificialSequence Synthetic DNA 367 gtgcgtggct ttggcggaga agtgctgtta cacggcgcaaacttcgatga agcaaaagcc 60 368 60 DNA Artificial Sequence Synthetic DNA368 acacggcgca aacttcgatg aagcaaaagc caaagcgatc gaactgtctc agcagcaggg 60369 60 DNA Artificial Sequence Synthetic DNA 369 aaagcgatcg aactgtctcagcagcagggg tttacctggg ttccgccgtt cgatcatccg 60 370 60 DNA ArtificialSequence Synthetic DNA 370 gtttacctgg gttccgccgt tcgatcatcc gatggtgattgctggtcagg gcactctggc 60 371 58 DNA Artificial Sequence Synthetic DNA371 atggtgattg ctggtcaggg cactctggcg ttagaactgc tccagcaaga tgcccacc 58372 56 DNA Artificial Sequence Synthetic DNA 372 gttagaactg ctccagcaagatgcccacct ggatcgtgtg tttgtgccgg taggtg 56 373 54 DNA ArtificialSequence Synthetic DNA 373 tggatcgtgt gtttgtgccg gtaggtggag gaggccttgctgcaggagtc gcag 54 374 60 DNA Artificial Sequence Synthetic DNA 374gaggaggcct tgctgcagga gtcgcagtgc tgatcaaaca gttgatgccg cagatcaagg 60 37560 DNA Artificial Sequence Synthetic DNA 375 tgctgatcaa acagttgatgccgcagatca aggttattgc cgtggaagcc gaggatagcg 60 376 60 DNA ArtificialSequence Synthetic DNA 376 ttattgccgt ggaagccgag gatagcgcct gtctgaaagcggcgttagat gctggtcatc 60 377 60 DNA Artificial Sequence Synthetic DNA377 cctgtctgaa agcggcgtta gatgctggtc atccggtcga tctgccacgt gtaggcctgt 60378 60 DNA Artificial Sequence Synthetic DNA 378 cggtcgatct gccacgtgtaggcctgtttg cggaaggtgt agcggtgaag cgtattggcg 60 379 60 DNA ArtificialSequence Synthetic DNA 379 ttgcggaagg tgtagcggtg aagcgtattg gcgacgaaacctttcggctg 
tgccaggaat 60 380 59 DNA Artificial Sequence Synthetic DNA380 acgaaacctt tcggctgtgc caggaatatc tcgacgacat catcaccgtt gacagcgat 59381 60 DNA Artificial Sequence Synthetic DNA 381 atctcgacga catcatcaccgttgacagcg atgcgatttg tgccgcgatg aaagacctgt 60 382 60 DNA ArtificialSequence Synthetic DNA 382 gcgatttgtg ccgcgatgaa agacctgttc gaagatgtgcgtgcggtagc tgaaccaagt 60 383 60 DNA Artificial Sequence Synthetic DNA383 tcgaagatgt gcgtgcggta gctgaaccaa gtggtgcatt agcgttggcc ggcatgaaaa 60384 60 DNA Artificial Sequence Synthetic DNA 384 ggtgcattag cgttggccggcatgaaaaag tacattgcct tgcacaacat tcgtggcgaa 60 385 60 DNA ArtificialSequence Synthetic DNA 385 agtacattgc cttgcacaac attcgtggcg aacgcctggcccatatcctg agtggagcca 60 386 55 DNA Artificial Sequence Synthetic DNA386 cgcctggccc atatcctgag tggagccaac gtcaacttcc atggcctgcg ttacg 55 38754 DNA Artificial Sequence Synthetic DNA 387 acgtcaactt ccatggcctgcgttacgtta gcgaacggtg tgaactgggc gagc 54 388 54 DNA Artificial SequenceSynthetic DNA 388 ttagcgaacg gtgtgaactg ggcgagcaac gtgaagcgct cctggccgttacca 54 389 54 DNA Artificial Sequence Synthetic DNA 389 aacgtgaagcgctcctggcc gttaccatcc cagaggagaa gggcagcttt ctga 54 390 54 DNAArtificial Sequence Synthetic DNA 390 tcccagagga gaagggcagc tttctgaaattttgccagct gttagggggc cgta 54 391 59 DNA Artificial Sequence SyntheticDNA 391 aattttgcca gctgttaggg ggccgtagcg tcaccgaatt caactaccgc tttgccgat59 392 60 DNA Artificial Sequence Synthetic DNA 392 gcgtcaccgaattcaactac cgctttgccg atgcgaaaaa tgcgtgcatt ttcgtgggcg 60 393 55 DNAArtificial Sequence Synthetic DNA 393 gcgaaaaatg cgtgcatttt cgtgggcgtccgtctgtccc gtggcctgga agagc 55 394 59 DNA Artificial Sequence SyntheticDNA 394 tccgtctgtc ccgtggcctg gaagagcgca aggagattct gcagatgctg aacgatggt59 395 60 DNA Artificial Sequence Synthetic DNA 395 gcaaggagattctgcagatg ctgaacgatg gtggttattc cgtggtggat ctcagcgacg 60 396 55 DNAArtificial Sequence Synthetic DNA 396 ggttattccg tggtggatct cagcgacgatgaaatggcca agctgcatgt gcgct 55 397 54 DNA Artificial Sequence SyntheticDNA 397 
atgaaatggc caagctgcat gtgcgctata tggtgggggg tcgtccgagt cacc 54398 55 DNA Artificial Sequence Synthetic DNA 398 atatggtggg gggtcgtccgagtcacccgt tgcaggaacg cttgtacagc ttcga 55 399 55 DNA Artificial SequenceSynthetic DNA 399 cgttgcagga acgcttgtac agcttcgagt ttccggagtc tcctggtgcactgtt 55 400 54 DNA Artificial Sequence Synthetic DNA 400 gtttccggagtctcctggtg cactgttacg cttcctgaac accctgggga cgta 54 401 59 DNAArtificial Sequence Synthetic DNA 401 acgcttcctg aacaccctgg ggacgtactggaacatcagc ctgtttcact atcgctccc 59 402 59 DNA Artificial SequenceSynthetic DNA 402 ctggaacatc agcctgtttc actatcgctc ccatggtact gactacggtcgggtcctgg 59 403 54 DNA Artificial Sequence Synthetic DNA 403 atggtactgactacggtcgg gtcctggctg cgtttgaact gggggaccac gaac 54 404 55 DNAArtificial Sequence Synthetic DNA 404 ctgcgtttga actgggggac cacgaaccggacttcgaaac ccgcctgaac gaatt 55 405 59 DNA Artificial Sequence SyntheticDNA 405 cggacttcga aacccgcctg aacgaattag gctacgactg ccatgacgaa accaacaac59 406 58 DNA Artificial Sequence Synthetic DNA 406 aggctacgactgccatgacg aaaccaacaa cccggcgttt cgcttttttc tggcaggg 58 407 27 DNAArtificial Sequence Synthetic DNA 407 ccggcgtttc gcttttttct ggcaggg 27408 45 DNA Artificial Sequence Synthetic DNA 408 tcctcgagca taatggcggattctcagccg ttgtctggtg ccccg 45 409 12 DNA Artificial Sequence SyntheticDNA 409 tatgctcgag ga 12 410 63 DNA Artificial Sequence Synthetic DNA410 gacgacgacg acaagcatat gctcgaggat atggcggatt ctcagccgtt gtctggtgcc 60ccg 63 411 30 DNA Artificial Sequence Synthetic DNA 411 atcctcgagcatatgcttgt cgtcgtcgtc 30 412 33 DNA Artificial Sequence Synthetic DNA412 atggcggatt ctcagccgtt gtctggtgcc ccg 33 413 60 DNA ArtificialSequence Synthetic DNA 413 taccgcccga agatactccg cgccttccgg ggcaccagacaacggctgag aatccgccat 60 414 56 DNA Artificial Sequence Synthetic DNA414 gaaggcgcgg agtatcttcg ggcggtactg cgagctccag tgtatgaagc cgcaca 56 41560 DNA Artificial Sequence Synthetic DNA 415 cagtttctcc attttttgcagcggggtcac ctgtgcggct tcatacactg gagctcgcag 60 416 60 DNA ArtificialSequence 
Synthetic DNA 416 ggtgaccccg ctgcaaaaaa tggagaaact gagctcccgtctggataacg tcatcctggt 60 417 60 DNA Artificial Sequence Synthetic DNA417 gctgtgaacc ggctgacgat cttcgcgttt gaccaggatg acgttatcca gacgggagct 60418 60 DNA Artificial Sequence Synthetic DNA 418 caaacgcgaa gatcgtcagccggttcacag cttcaaactg cgcggggcct atgctatgat 60 419 60 DNA ArtificialSequence Synthetic DNA 419 atgtgctttc tgctcttccg ttaagcccgc catcatagcataggccccgc gcagtttgaa 60 420 60 DNA Artificial Sequence Synthetic DNA420 ggcgggctta acggaagagc agaaagcaca tggcgttatt accgcgtctg caggcaacca 60421 60 DNA Artificial Sequence Synthetic DNA 421 acgcgcacta gaaaacgctaccccctgtgc atggttgcct gcagacgcgg taataacgcc 60 422 60 DNA ArtificialSequence Synthetic DNA 422 tgcacagggg gtagcgtttt ctagtgcgcg tttaggcgtgaaagcgttga tcgtgatgcc 60 423 60 DNA Artificial Sequence Synthetic DNA423 ggcatcgact ttgatatcag cggttgcggt cggcatcacg atcaacgctt tcacgcctaa 60424 60 DNA Artificial Sequence Synthetic DNA 424 gaccgcaacc gctgatatcaaagtcgatgc cgtgcgtggc tttggcggag aagtgctgtt 60 425 60 DNA ArtificialSequence Synthetic DNA 425 ggcttttgct tcatcgaagt ttgcgccgtg taacagcacttctccgccaa agccacgcac 60 426 60 DNA Artificial Sequence Synthetic DNA426 acacggcgca aacttcgatg aagcaaaagc caaagcgatc gaactgtctc agcagcaggg 60427 60 DNA Artificial Sequence Synthetic DNA 427 cggatgatcg aacggcggaacccaggtaaa cccctgctgc tgagacagtt cgatcgcttt 60 428 60 DNA ArtificialSequence Synthetic DNA 428 gtttacctgg gttccgccgt tcgatcatcc gatggtgattgctggtcagg gcactctggc 60 429 58 DNA Artificial Sequence Synthetic DNA429 ggtgggcatc ttgctggagc agttctaacg ccagagtgcc ctgaccagca atcaccat 58430 56 DNA Artificial Sequence Synthetic DNA 430 gttagaactg ctccagcaagatgcccacct ggatcgtgtg tttgtgccgg taggtg 56 431 54 DNA ArtificialSequence Synthetic DNA 431 ctgcgactcc tgcagcaagg cctcctccac ctaccggcacaaacacacga tcca 54 432 60 DNA Artificial Sequence Synthetic DNA 432gaggaggcct tgctgcagga gtcgcagtgc tgatcaaaca gttgatgccg cagatcaagg 60 43360 DNA Artificial Sequence Synthetic DNA 433 cgctatcctc 
ggcttccacggcaataacct tgatctgcgg catcaactgt ttgatcagca 60 434 60 DNA ArtificialSequence Synthetic DNA 434 ttattgccgt ggaagccgag gatagcgcct gtctgaaagcggcgttagat gctggtcatc 60 435 60 DNA Artificial Sequence Synthetic DNA435 acaggcctac acgtggcaga tcgaccggat gaccagcatc taacgccgct ttcagacagg 60436 60 DNA Artificial Sequence Synthetic DNA 436 cggtcgatct gccacgtgtaggcctgtttg cggaaggtgt agcggtgaag cgtattggcg 60 437 60 DNA ArtificialSequence Synthetic DNA 437 attcctggca cagccgaaag gtttcgtcgc caatacgcttcaccgctaca ccttccgcaa 60 438 59 DNA Artificial Sequence Synthetic DNA438 acgaaacctt tcggctgtgc caggaatatc tcgacgacat catcaccgtt gacagcgat 59439 60 DNA Artificial Sequence Synthetic DNA 439 acaggtcttt catcgcggcacaaatcgcat cgctgtcaac ggtgatgatg tcgtcgagat 60 440 60 DNA ArtificialSequence Synthetic DNA 440 gcgatttgtg ccgcgatgaa agacctgttc gaagatgtgcgtgcggtagc tgaaccaagt 60 441 60 DNA Artificial Sequence Synthetic DNA441 ttttcatgcc ggccaacgct aatgcaccac ttggttcagc taccgcacgc acatcttcga 60442 60 DNA Artificial Sequence Synthetic DNA 442 ggtgcattag cgttggccggcatgaaaaag tacattgcct tgcacaacat tcgtggcgaa 60 443 60 DNA ArtificialSequence Synthetic DNA 443 tggctccact caggatatgg gccaggcgtt cgccacgaatgttgtgcaag gcaatgtact 60 444 55 DNA Artificial Sequence Synthetic DNA444 cgcctggccc atatcctgag tggagccaac gtcaacttcc atggcctgcg ttacg 55 44554 DNA Artificial Sequence Synthetic DNA 445 gctcgcccag ttcacaccgttcgctaacgt aacgcaggcc atggaagttg acgt 54 446 54 DNA Artificial SequenceSynthetic DNA 446 ttagcgaacg gtgtgaactg ggcgagcaac gtgaagcgct cctggccgttacca 54 447 54 DNA Artificial Sequence Synthetic DNA 447 tcagaaagctgcccttctcc tctgggatgg taacggccag gagcgcttca cgtt 54 448 54 DNAArtificial Sequence Synthetic DNA 448 tcccagagga gaagggcagc tttctgaaattttgccagct gttagggggc cgta 54 449 59 DNA Artificial Sequence SyntheticDNA 449 atcggcaaag cggtagttga attcggtgac gctacggccc cctaacagct ggcaaaatt59 450 60 DNA Artificial Sequence Synthetic DNA 450 gcgtcaccgaattcaactac cgctttgccg atgcgaaaaa tgcgtgcatt 
ttcgtgggcg 60 451 55 DNAArtificial Sequence Synthetic DNA 451 gctcttccag gccacgggac agacggacgcccacgaaaat gcacgcattt ttcgc 55 452 59 DNA Artificial Sequence SyntheticDNA 452 tccgtctgtc ccgtggcctg gaagagcgca aggagattct gcagatgctg aacgatggt59 453 60 DNA Artificial Sequence Synthetic DNA 453 cgtcgctgagatccaccacg gaataaccac catcgttcag catctgcaga atctccttgc 60 454 55 DNAArtificial Sequence Synthetic DNA 454 ggttattccg tggtggatct cagcgacgatgaaatggcca agctgcatgt gcgct 55 455 54 DNA Artificial Sequence SyntheticDNA 455 ggtgactcgg acgacccccc accatatagc gcacatgcag cttggccatt tcat 54456 55 DNA Artificial Sequence Synthetic DNA 456 atatggtggg gggtcgtccgagtcacccgt tgcaggaacg cttgtacagc ttcga 55 457 55 DNA Artificial SequenceSynthetic DNA 457 aacagtgcac caggagactc cggaaactcg aagctgtaca agcgttcctgcaacg 55 458 54 DNA Artificial Sequence Synthetic DNA 458 gtttccggagtctcctggtg cactgttacg cttcctgaac accctgggga cgta 54 459 59 DNAArtificial Sequence Synthetic DNA 459 gggagcgata gtgaaacagg ctgatgttccagtacgtccc cagggtgttc aggaagcgt 59 460 59 DNA Artificial SequenceSynthetic DNA 460 ctggaacatc agcctgtttc actatcgctc ccatggtact gactacggtcgggtcctgg 59 461 54 DNA Artificial Sequence Synthetic DNA 461 gttcgtggtcccccagttca aacgcagcca ggacccgacc gtagtcagta ccat 54 462 55 DNAArtificial Sequence Synthetic DNA 462 ctgcgtttga actgggggac cacgaaccggacttcgaaac ccgcctgaac gaatt 55 463 59 DNA Artificial Sequence SyntheticDNA 463 gttgttggtt tcgtcatggc agtcgtagcc taattcgttc aggcgggttt cgaagtccg59 464 58 DNA Artificial Sequence Synthetic DNA 464 aggctacgactgccatgacg aaaccaacaa cccggcgttt cgcttttttc tggcaggg 58 465 27 DNAArtificial Sequence Synthetic DNA 465 ccctgccaga aaaaagcgaa acgccgg 27466 42 DNA Artificial Sequence Synthetic DNA 466 gcggcagcca tattaccctgccagaaaaaa gcgaaacgcc gg 42 467 15 DNA Artificial Sequence Synthetic DNA467 taatatggct gccgc 15 468 57 DNA Artificial Sequence Synthetic DNA 468ttaagcgtaa tccggaacat cgtatgggta ccctgccaga aaaaagcgaa acgccgg 57 469 30DNA Artificial Sequence Synthetic DNA 
469 tacccatacg atgttccggattacgcttaa 30 470 31 DNA Artificial Sequence Synthetic DNA 470ctatatctag catatgtcat tcatggacca g 31 471 36 DNA Artificial SequenceSynthetic DNA 471 atgtcattca tggaccagat tccgggcggg ggtaac 36 472 47 DNAArtificial Sequence Synthetic DNA 472 gcaaacattc tactggcaat ttaggatagttacccccgcc cggaatc 47 473 46 DNA Artificial Sequence Synthetic DNA 473gccagtagaa tgtttgccga attttcccat tcaaccaagt ctgacc 46 474 47 DNAArtificial Sequence Synthetic DNA 474 gtttgtggct atcgtttctc ccgcgaaaggtcagacttgg ttgaatg 47 475 50 DNA Artificial Sequence Synthetic DNA 475gaaacgatag ccacaaactg aagaatttca ttagcgagat tatgctcaac 50 476 50 DNAArtificial Sequence Synthetic DNA 476 catcgttagg ccaagagatc atcgacatgttgagcataat ctcgctaatg 50 477 50 DNA Artificial Sequence Synthetic DNA477 ctcttggcct aacgatgcgt ctagaattgt gtactgccgt cgtcatttac 50 478 50 DNAArtificial Sequence Synthetic DNA 478 caaagtcatt agcccactga gcagctggattaagtaaatg acgacggcag 50 479 50 DNA Artificial Sequence Synthetic DNA479 gtgggctaat gactttgtgc aagaacaggg tattctcgag attacgttcg 50 480 48 DNAArtificial Sequence Synthetic DNA 480 gttgatacag cccctggata aatgtatcgaacgtaatctc gagaatac 48 481 48 DNA Artificial Sequence Synthetic DNA 481gtattctcga gattacgttc gatacattta tccaggggct gtatcaac 48 482 49 DNAArtificial Sequence Synthetic DNA 482 gattttattg atatcaggcg gtttataaaagtgttgatac agcccctgg 49 483 50 DNA Artificial Sequence Synthetic DNA 483ccgcctgata tcaataaaat ctttaacgcc atcacgcagc tgtccgaggc 50 484 49 DNAArtificial Sequence Synthetic DNA 484 ccgttgattc agacgttcaa tgcctaattttgcctcggac agctgcgtg 49 485 49 DNA Artificial Sequence Synthetic DNA 485gaacgtctga atcaacggtt tcggaaaatt tgggatcgca tgccaccag 49 486 46 DNAArtificial Sequence Synthetic DNA 486 cataattgcg gccttttctg tcatgaaatctggtggcatg cgatcc 46 487 48 DNA Artificial Sequence Synthetic DNA 487gaaaaggccg caattatgac gtatacccgg ttactgacga aagagacc 48 488 49 DNAArtificial Sequence Synthetic DNA 488 ctccggctta tgcatacgta caatattataggtctctttc gtcagtaac 49 489 50 DNA 
Artificial Sequence Synthetic DNA 489gtatgcataa gccggagacc ctgaaagatg cgatggagga agcctaccag 50 490 50 DNAArtificial Sequence Synthetic DNA 490 cagggaagaa tcgttcggta agggcagtggtctggtaggc ttcctccatc 50 491 50 DNA Artificial Sequence Synthetic DNA491 gatggaggaa gcctaccaga ccactgccct taccgaacga ttcttccctg 50 492 49 DNAArtificial Sequence Synthetic DNA 492 gatggtatct ccgtccgcgt ccagttcaaagccagggaag aatcgttcg 49 493 50 DNA Artificial Sequence Synthetic DNA 493cggacggaga taccatcata ggcgcaacca ctcacttgca ggaagagtac 50 494 50 DNAArtificial Sequence Synthetic DNA 494 caggttatct tccgaatcgt aatcagaatcgtactcttcc tgcaagtgag 50 495 49 DNA Artificial Sequence Synthetic DNA495 cgattcggaa gataacctga cccaaaatgg ctacgttcac actgttagg 49 496 49 DNAArtificial Sequence Synthetic DNA 496 gctcatgggc ttattgtatg aacgacgggtcctaacagtg tgaacgtag 49 497 50 DNA Artificial Sequence Synthetic DNA 497catacaataa gcccatgagc aaccatcgga accgcagaaa caacaacccg 50 498 50 DNAArtificial Sequence Synthetic DNA 498 gcacagacgg tttttgatgc attcttctcggctcgggttg ttgtttctgc 50 499 49 DNA Artificial Sequence Synthetic DNA499 catcaaaaac cgtctgtgct tttattgtaa gaaagaaggc catcgactg 49 500 48 DNAArtificial Sequence Synthetic DNA 500 gagcttgctt tacgggcacg gcattcattcagtcgatggc cttctttc 48 501 27 DNA Artificial Sequence Synthetic DNA 501gcccgtaaag caagctctaa ccgtagc 27 502 17 DNA Artificial SequenceSynthetic DNA 502 gctacggtta gagcttg 17 503 34 DNA Artificial SequenceSynthetic DNA 503 gtattggatc cttattagct acggttagag cttg 34 504 1641 DNAArtificial Sequence Synthetic DNA 504 ctatatctag catatgacca tcaccccggaaacctctcgc ccgatcgaca ccgaatcttg 60 gaaatcttac tacaaatccg acccgctgtgctctgctgtt ctgatccaca tgaaagaact 120 gacccagcac aacgttaccc cagaagacatgtccgctttc cgctcctatc agaaaaagct 180 ggaactgtct gagaccttcc gtaaaaactactccctggag gacgaaatga tctactacca 240 agatcgcctg gttgtaccga ttaaacaacaaaatgctgtc atgcgtctgt atcacgatca 300 cactctgttt ggtggtcact ttggcgtaaccgttaccctg gcgaaaatct ctccgatcta 360 ctattggccg aaactgcagc actctatcatccagtacatc 
cgtacctgcg ttcagtgcca 420 gctgatcaaa tctcaccgcc cacgtctgcatggtctcctg caaccgctcc cgatcgctga 480 aggtcgttgg ctggacatct ctatggacttcgttactggt ttgccgccga cctctaacaa 540 cctgaacatg atcctggtgg tggtggaccgcttctctaaa cgtgctcact tcatcgctac 600 ccgaaaaacc ctggacgcga ctcagctgatcgacctgctc ttccgttaca tcttctctta 660 ccatggcttc ccgcgtacca tcacctctgaccgtgacgtt cgtatgactg cggacaaata 720 ccaagaactg accaaacgtc tgggtatcaaatctaccatg tcttccgcta accacccgca 780 gactgatggt caatccgagc gtaccattcagaccctgaac cgtctcctgc gtgcgtatgc 840 gtctaccaac atccagaact ggcacgtttaccttccgcaa attgaattcg tttacaactc 900 cactccgact cgtactctgg gtaaatctccgttcgaaatc gacctgggtt acctgccaaa 960 cactccggcg atcaaatctg acgatgaagttaacgctcgt tccttcaccg ctgttgaact 1020 ggctaaacac ctgaaggcgc tgaccatccagaccaaagaa cagctggaac acgcgcagat 1080 cgaaatggaa accaacaaca accagcgtcgcaaaccactg ctgttgaata ttggtgatca 1140 tgttctggta caccgtgatg cctacttcaaaaaaggtgcg tacatgaaag ttcagcagat 1200 ctacgttggt ccattccgtg tcgttaagaaaatcaatgac aacgcgtatg aactggacct 1260 gaactcgcat aagaagaagc accgtgtgatcaacgttcag tttctgaaaa aatttgttta 1320 ccgtccggat gcgtacccga aaaacaaaccgatctcttct accgaacgca tcaaacgagc 1380 tcacgaagtt accgcgctga tcggcatcgacaccacccac aaaacctatc tgtgccacat 1440 gcaggacgtt gacccgaccc tgtccgttgaatactccgaa gctgaattct gccagatccc 1500 agagcgtacc cgtcgttcta tcctggcgaacttccgtcag ctgtacgaaa cccaagacaa 1560 ccctgaacgt gaagaagatg ttgtttcccagaacgaaatc tgccagtacg acaacacctc 1620 tccgtaataa ggatccaata c 1641 50546 DNA Artificial Sequence Synthetic DNA 505 ctatatctag catatgaccatcaccccgga aacctctcgc ccgatc 46 506 25 DNA Artificial Sequence SyntheticDNA 506 accatcaccc cggaaacctc tcgcc 25 507 50 DNA Artificial SequenceSynthetic DNA 507 accatcaccc cggaaacctc tcgcccgatc gacaccgaat cttggaaatc50 508 50 DNA Artificial Sequence Synthetic DNA 508 cgatcgacaccgaatcttgg aaatcttact acaaatccga cccgctgtgc 50 509 50 DNA ArtificialSequence Synthetic DNA 509 ttactacaaa tccgacccgc tgtgctctgc tgttctgatccacatgaaag 50 510 49 DNA Artificial Sequence Synthetic DNA 
510tctgctgttc tgatccacat gaaagaactg acccagcaca acgttaccc 49 511 45 DNAArtificial Sequence Synthetic DNA 511 aactgaccca gcacaacgtt accccagaagacatgtccgc tttcc 45 512 46 DNA Artificial Sequence Synthetic DNA 512cagaagacat gtccgctttc cgctcctatc agaaaaagct ggaact 46 513 48 DNAArtificial Sequence Synthetic DNA 513 gctcctatca gaaaaagctg gaactgtctgagaccttccg taaaaact 48 514 48 DNA Artificial Sequence Synthetic DNA 514gctcctatca gaaaaagctg gaactgtctg agaccttccg taaaaact 48 515 50 DNAArtificial Sequence Synthetic DNA 515 gtctgagacc ttccgtaaaa actactccctggaggacgaa atgatctact 50 516 50 DNA Artificial Sequence Synthetic DNA516 actccctgga ggacgaaatg atctactacc aagatcgcct ggttgtaccg 50 517 49 DNAArtificial Sequence Synthetic DNA 517 accaagatcg cctggttgta ccgattaaacaacaaaatgc tgtcatgcg 49 518 50 DNA Artificial Sequence Synthetic DNA 518attaaacaac aaaatgctgt catgcgtctg tatcacgatc acactctgtt 50 519 50 DNAArtificial Sequence Synthetic DNA 519 tctgtatcac gatcacactc tgtttggtggtcactttggc gtaaccgtta 50 520 50 DNA Artificial Sequence Synthetic DNA520 tggtggtcac tttggcgtaa ccgttaccct ggcgaaaatc tctccgatct 50 521 50 DNAArtificial Sequence Synthetic DNA 521 ccctggcgaa aatctctccg atctactattggccgaaact gcagcactct 50 522 50 DNA Artificial Sequence Synthetic DNA522 ccctggcgaa aatctctccg atctactatt ggccgaaact gcagcactct 50 523 50 DNAArtificial Sequence Synthetic DNA 523 actattggcc gaaactgcag cactctatcatccagtacat ccgtacctgc 50 524 50 DNA Artificial Sequence Synthetic DNA524 atcatccagt acatccgtac ctgcgttcag tgccagctga tcaaatctca 50 525 50 DNAArtificial Sequence Synthetic DNA 525 gttcagtgcc agctgatcaa atctcaccgcccacgtctgc atggtctcct 50 526 50 DNA Artificial Sequence Synthetic DNA526 ccgcccacgt ctgcatggtc tcctgcaacc gctcccgatc gctgaaggtc 50 527 50 DNAArtificial Sequence Synthetic DNA 527 gcaaccgctc ccgatcgctg aaggtcgttggctggacatc tctatggact 50 528 47 DNA Artificial Sequence Synthetic DNA528 gttggctgga catctctatg gacttcgtta ctggtttgcc gccgacc 47 529 50 DNAArtificial Sequence Synthetic DNA 529 
tcgttactgg tttgccgccg acctctaacaacctgaacat gatcctggtg 50 530 50 DNA Artificial Sequence Synthetic DNA530 tcgttactgg tttgccgccg acctctaaca acctgaacat gatcctggtg 50 531 50 DNAArtificial Sequence Synthetic DNA 531 tctaacaacc tgaacatgat cctggtggtggtggaccgct tctctaaacg 50 532 50 DNA Artificial Sequence Synthetic DNA532 gtggtggacc gcttctctaa acgtgctcac ttcatcgcta cccgaaaaac 50 533 50 DNAArtificial Sequence Synthetic DNA 533 tgctcacttc atcgctaccc gaaaaaccctggacgcgact cagctgatcg 50 534 50 DNA Artificial Sequence Synthetic DNA534 cctggacgcg actcagctga tcgacctgct cttccgttac atcttctctt 50 535 50 DNAArtificial Sequence Synthetic DNA 535 acctgctctt ccgttacatc ttctcttaccatggcttccc gcgtaccatc 50 536 50 DNA Artificial Sequence Synthetic DNA536 accatggctt cccgcgtacc atcacctctg accgtgacgt tcgtatgact 50 537 50 DNAArtificial Sequence Synthetic DNA 537 acctctgacc gtgacgttcg tatgactgcggacaaatacc aagaactgac 50 538 50 DNA Artificial Sequence Synthetic DNA538 acctctgacc gtgacgttcg tatgactgcg gacaaatacc aagaactgac 50 539 50 DNAArtificial Sequence Synthetic DNA 539 gcggacaaat accaagaact gaccaaacgtctgggtatca aatctaccat 50 540 50 DNA Artificial Sequence Synthetic DNA540 caaacgtctg ggtatcaaat ctaccatgtc ttccgctaac cacccgcaga 50 541 50 DNAArtificial Sequence Synthetic DNA 541 gtcttccgct aaccacccgc agactgatggtcaatccgag cgtaccattc 50 542 50 DNA Artificial Sequence Synthetic DNA542 ctgatggtca atccgagcgt accattcaga ccctgaaccg tctcctgcgt 50 543 50 DNAArtificial Sequence Synthetic DNA 543 agaccctgaa ccgtctcctg cgtgcgtatgcgtctaccaa catccagaac 50 544 50 DNA Artificial Sequence Synthetic DNA544 gcgtatgcgt ctaccaacat ccagaactgg cacgtttacc ttccgcaaat 50 545 50 DNAArtificial Sequence Synthetic DNA 545 tggcacgttt accttccgca aattgaattcgtttacaact ccactccgac 50 546 50 DNA Artificial Sequence Synthetic DNA546 tggcacgttt accttccgca aattgaattc gtttacaact ccactccgac 50 547 50 DNAArtificial Sequence Synthetic DNA 547 tgaattcgtt tacaactcca ctccgactcgtactctgggt aaatctccgt 50 548 50 DNA Artificial Sequence Synthetic 
DNA548 tcgtactctg ggtaaatctc cgttcgaaat cgacctgggt tacctgccaa 50 549 50 DNAArtificial Sequence Synthetic DNA 549 tcgaaatcga cctgggttac ctgccaaacactccggcgat caaatctgac 50 550 50 DNA Artificial Sequence Synthetic DNA550 acactccggc gatcaaatct gacgatgaag ttaacgctcg ttccttcacc 50 551 50 DNAArtificial Sequence Synthetic DNA 551 gatgaagtta acgctcgttc cttcaccgctgttgaactgg ctaaacacct 50 552 50 DNA Artificial Sequence Synthetic DNA552 gctgttgaac tggctaaaca cctgaaggcg ctgaccatcc agaccaaaga 50 553 50 DNAArtificial Sequence Synthetic DNA 553 gaaggcgctg accatccaga ccaaagaacagctggaacac gcgcagatcg 50 554 50 DNA Artificial Sequence Synthetic DNA554 gaaggcgctg accatccaga ccaaagaaca gctggaacac gcgcagatcg 50 555 50 DNAArtificial Sequence Synthetic DNA 555 acagctggaa cacgcgcaga tcgaaatggaaaccaacaac aaccagcgtc 50 556 50 DNA Artificial Sequence Synthetic DNA556 aaatggaaac caacaacaac cagcgtcgca aaccactgct gttgaatatt 50 557 50 DNAArtificial Sequence Synthetic DNA 557 gcaaaccact gctgttgaat attggtgatcatgttctggt acaccgtgat 50 558 50 DNA Artificial Sequence Synthetic DNA558 ggtgatcatg ttctggtaca ccgtgatgcc tacttcaaaa aaggtgcgta 50 559 50 DNAArtificial Sequence Synthetic DNA 559 gcctacttca aaaaaggtgc gtacatgaaagttcagcaga tctacgttgg 50 560 50 DNA Artificial Sequence Synthetic DNA560 catgaaagtt cagcagatct acgttggtcc attccgtgtc gttaagaaaa 50 561 47 DNAArtificial Sequence Synthetic DNA 561 tccattccgt gtcgttaaga aaatcaatgacaacgcgtat gaactgg 47 562 47 DNA Artificial Sequence Synthetic DNA 562tccattccgt gtcgttaaga aaatcaatga caacgcgtat gaactgg 47 563 50 DNAArtificial Sequence Synthetic DNA 563 tcaatgacaa cgcgtatgaa ctggacctgaactcgcataa gaagaagcac 50 564 50 DNA Artificial Sequence Synthetic DNA564 acctgaactc gcataagaag aagcaccgtg tgatcaacgt tcagtttctg 50 565 49 DNAArtificial Sequence Synthetic DNA 565 cgtgtgatca acgttcagtt tctgaaaaaatttgtttacc gtccggatg 49 566 50 DNA Artificial Sequence Synthetic DNA 566aaaaaatttg tttaccgtcc ggatgcgtac ccgaaaaaca aaccgatctc 50 567 50 DNAArtificial Sequence Synthetic 
DNA 567 cgtacccgaa aaacaaaccg atctcttctaccgaacgcat caaacgagct 50 568 50 DNA Artificial Sequence Synthetic DNA568 cgtacccgaa aaacaaaccg atctcttcta ccgaacgcat caaacgagct 50 569 50 DNAArtificial Sequence Synthetic DNA 569 ttctaccgaa cgcatcaaac gagctcacgaagttaccgcg ctgatcggca 50 570 50 DNA Artificial Sequence Synthetic DNA570 cacgaagtta ccgcgctgat cggcatcgac accacccaca aaacctatct 50 571 50 DNAArtificial Sequence Synthetic DNA 571 tcgacaccac ccacaaaacc tatctgtgccacatgcagga cgttgacccg 50 572 50 DNA Artificial Sequence Synthetic DNA572 gtgccacatg caggacgttg acccgaccct gtccgttgaa tactccgaag 50 573 50 DNAArtificial Sequence Synthetic DNA 573 accctgtccg ttgaatactc cgaagctgaattctgccaga tcccagagcg 50 574 50 DNA Artificial Sequence Synthetic DNA574 accctgtccg ttgaatactc cgaagctgaa ttctgccaga tcccagagcg 50 575 50 DNAArtificial Sequence Synthetic DNA 575 ctgaattctg ccagatccca gagcgtacccgtcgttctat cctggcgaac 50 576 50 DNA Artificial Sequence Synthetic DNA576 tacccgtcgt tctatcctgg cgaacttccg tcagctgtac gaaacccaag 50 577 47 DNAArtificial Sequence Synthetic DNA 577 ttccgtcagc tgtacgaaac ccaagacaaccctgaacgtg aagaaga 47 578 46 DNA Artificial Sequence Synthetic DNA 578acaaccctga acgtgaagaa gatgttgttt cccagaacga aatctg 46 579 46 DNAArtificial Sequence Synthetic DNA 579 tgttgtttcc cagaacgaaa tctgccagtacgacaacacc tctccg 46 580 22 DNA Artificial Sequence Synthetic DNA 580ccagtacgac aacacctctc cg 22 581 47 DNA Artificial Sequence Synthetic DNA581 gaaatctgcc agtacgacaa cacctctccg taataaggat ccaatac 47 582 46 DNAArtificial Sequence Synthetic DNA 582 ctatatctag catatgacca tcaccccggaaacctctcgc ccgatc 46 583 25 DNA Artificial Sequence Synthetic DNA 583accatcaccc cggaaacctc tcgcc 25 584 50 DNA Artificial Sequence SyntheticDNA 584 gatttccaag attcggtgtc gatcgggcga gaggtttccg gggtgatggt 50 585 50DNA Artificial Sequence Synthetic DNA 585 cgatcgacac cgaatcttggaaatcttact acaaatccga cccgctgtgc 50 586 50 DNA Artificial SequenceSynthetic DNA 586 ctttcatgtg gatcagaaca gcagagcaca gcgggtcgga tttgtagtaa50 587 
49 DNA Artificial Sequence Synthetic DNA 587 tctgctgttctgatccacat gaaagaactg acccagcaca acgttaccc 49 588 45 DNA ArtificialSequence Synthetic DNA 588 ggaaagcgga catgtcttct ggggtaacgt tgtgctgggtcagtt 45 589 46 DNA Artificial Sequence Synthetic DNA 589 cagaagacatgtccgctttc cgctcctatc agaaaaagct ggaact 46 590 48 DNA ArtificialSequence Synthetic DNA 590 agtttttacg gaaggtctca gacagttcca gctttttctgataggagc 48 591 48 DNA Artificial Sequence Synthetic DNA 591 gctcctatcagaaaaagctg gaactgtctg agaccttccg taaaaact 48 592 50 DNA ArtificialSequence Synthetic DNA 592 agtagatcat ttcgtcctcc agggagtagt ttttacggaaggtctcagac 50 593 50 DNA Artificial Sequence Synthetic DNA 593actccctgga ggacgaaatg atctactacc aagatcgcct ggttgtaccg 50 594 49 DNAArtificial Sequence Synthetic DNA 594 cgcatgacag cattttgttg tttaatcggtacaaccaggc gatcttggt 49 595 50 DNA Artificial Sequence Synthetic DNA 595attaaacaac aaaatgctgt catgcgtctg tatcacgatc acactctgtt 50 596 50 DNAArtificial Sequence Synthetic DNA 596 taacggttac gccaaagtga ccaccaaacagagtgtgatc gtgatacaga 50 597 50 DNA Artificial Sequence Synthetic DNA597 tggtggtcac tttggcgtaa ccgttaccct ggcgaaaatc tctccgatct 50 598 50 DNAArtificial Sequence Synthetic DNA 598 agagtgctgc agtttcggcc aatagtagatcggagagatt ttcgccaggg 50 599 50 DNA Artificial Sequence Synthetic DNA599 ccctggcgaa aatctctccg atctactatt ggccgaaact gcagcactct 50 600 50 DNAArtificial Sequence Synthetic DNA 600 gcaggtacgg atgtactgga tgatagagtgctgcagtttc ggccaatagt 50 601 50 DNA Artificial Sequence Synthetic DNA601 atcatccagt acatccgtac ctgcgttcag tgccagctga tcaaatctca 50 602 50 DNAArtificial Sequence Synthetic DNA 602 aggagaccat gcagacgtgg gcggtgagatttgatcagct ggcactgaac 50 603 50 DNA Artificial Sequence Synthetic DNA603 ccgcccacgt ctgcatggtc tcctgcaacc gctcccgatc gctgaaggtc 50 604 50 DNAArtificial Sequence Synthetic DNA 604 agtccataga gatgtccagc caacgaccttcagcgatcgg gagcggttgc 50 605 47 DNA Artificial Sequence Synthetic DNA605 gttggctgga catctctatg gacttcgtta ctggtttgcc gccgacc 47 606 50 
DNAArtificial Sequence Synthetic DNA 606 caccaggatc atgttcaggt tgttagaggtcggcggcaaa ccagtaacga 50 607 50 DNA Artificial Sequence Synthetic DNA607 tcgttactgg tttgccgccg acctctaaca acctgaacat gatcctggtg 50 608 50 DNAArtificial Sequence Synthetic DNA 608 cgtttagaga agcggtccac caccaccaggatcatgttca ggttgttaga 50 609 50 DNA Artificial Sequence Synthetic DNA609 gtggtggacc gcttctctaa acgtgctcac ttcatcgcta cccgaaaaac 50 610 50 DNAArtificial Sequence Synthetic DNA 610 cgatcagctg agtcgcgtcc agggtttttcgggtagcgat gaagtgagca 50 611 50 DNA Artificial Sequence Synthetic DNA611 cctggacgcg actcagctga tcgacctgct cttccgttac atcttctctt 50 612 50 DNAArtificial Sequence Synthetic DNA 612 gatggtacgc gggaagccat ggtaagagaagatgtaacgg aagagcaggt 50 613 50 DNA Artificial Sequence Synthetic DNA613 accatggctt cccgcgtacc atcacctctg accgtgacgt tcgtatgact 50 614 50 DNAArtificial Sequence Synthetic DNA 614 gtcagttctt ggtatttgtc cgcagtcatacgaacgtcac ggtcagaggt 50 615 50 DNA Artificial Sequence Synthetic DNA615 acctctgacc gtgacgttcg tatgactgcg gacaaatacc aagaactgac 50 616 50 DNAArtificial Sequence Synthetic DNA 616 atggtagatt tgatacccag acgtttggtcagttcttggt atttgtccgc 50 617 50 DNA Artificial Sequence Synthetic DNA617 caaacgtctg ggtatcaaat ctaccatgtc ttccgctaac cacccgcaga 50 618 50 DNAArtificial Sequence Synthetic DNA 618 gaatggtacg ctcggattga ccatcagtctgcgggtggtt agcggaagac 50 619 50 DNA Artificial Sequence Synthetic DNA619 ctgatggtca atccgagcgt accattcaga ccctgaaccg tctcctgcgt 50 620 50 DNAArtificial Sequence Synthetic DNA 620 gttctggatg ttggtagacg catacgcacgcaggagacgg ttcagggtct 50 621 50 DNA Artificial Sequence Synthetic DNA621 gcgtatgcgt ctaccaacat ccagaactgg cacgtttacc ttccgcaaat 50 622 50 DNAArtificial Sequence Synthetic DNA 622 gtcggagtgg agttgtaaac gaattcaatttgcggaaggt aaacgtgcca 50 623 50 DNA Artificial Sequence Synthetic DNA623 tggcacgttt accttccgca aattgaattc gtttacaact ccactccgac 50 624 50 DNAArtificial Sequence Synthetic DNA 624 acggagattt acccagagta cgagtcggagtggagttgta aacgaattca 50 
625 50 DNA Artificial Sequence Synthetic DNA625 tcgtactctg ggtaaatctc cgttcgaaat cgacctgggt tacctgccaa 50 626 50 DNAArtificial Sequence Synthetic DNA 626 gtcagatttg atcgccggag tgtttggcaggtaacccagg tcgatttcga 50 627 50 DNA Artificial Sequence Synthetic DNA627 acactccggc gatcaaatct gacgatgaag ttaacgctcg ttccttcacc 50 628 50 DNAArtificial Sequence Synthetic DNA 628 aggtgtttag ccagttcaac agcggtgaaggaacgagcgt taacttcatc 50 629 50 DNA Artificial Sequence Synthetic DNA629 gctgttgaac tggctaaaca cctgaaggcg ctgaccatcc agaccaaaga 50 630 50 DNAArtificial Sequence Synthetic DNA 630 cgatctgcgc gtgttccagc tgttctttggtctggatggt cagcgccttc 50 631 50 DNA Artificial Sequence Synthetic DNA631 gaaggcgctg accatccaga ccaaagaaca gctggaacac gcgcagatcg 50 632 50 DNAArtificial Sequence Synthetic DNA 632 gacgctggtt gttgttggtt tccatttcgatctgcgcgtg ttccagctgt 50 633 50 DNA Artificial Sequence Synthetic DNA633 aaatggaaac caacaacaac cagcgtcgca aaccactgct gttgaatatt 50 634 50 DNAArtificial Sequence Synthetic DNA 634 atcacggtgt accagaacat gatcaccaatattcaacagc agtggtttgc 50 635 50 DNA Artificial Sequence Synthetic DNA635 ggtgatcatg ttctggtaca ccgtgatgcc tacttcaaaa aaggtgcgta 50 636 50 DNAArtificial Sequence Synthetic DNA 636 ccaacgtaga tctgctgaac tttcatgtacgcaccttttt tgaagtaggc 50 637 50 DNA Artificial Sequence Synthetic DNA637 catgaaagtt cagcagatct acgttggtcc attccgtgtc gttaagaaaa 50 638 47 DNAArtificial Sequence Synthetic DNA 638 ccagttcata cgcgttgtca ttgattttcttaacgacacg gaatgga 47 639 47 DNA Artificial Sequence Synthetic DNA 639tccattccgt gtcgttaaga aaatcaatga caacgcgtat gaactgg 47 640 50 DNAArtificial Sequence Synthetic DNA 640 gtgcttcttc ttatgcgagt tcaggtccagttcatacgcg ttgtcattga 50 641 50 DNA Artificial Sequence Synthetic DNA641 acctgaactc gcataagaag aagcaccgtg tgatcaacgt tcagtttctg 50 642 49 DNAArtificial Sequence Synthetic DNA 642 catccggacg gtaaacaaat tttttcagaaactgaacgtt gatcacacg 49 643 50 DNA Artificial Sequence Synthetic DNA 643aaaaaatttg tttaccgtcc ggatgcgtac ccgaaaaaca aaccgatctc 50 
644 50 DNAArtificial Sequence Synthetic DNA 644 agctcgtttg atgcgttcgg tagaagagatcggtttgttt ttcgggtacg 50 645 50 DNA Artificial Sequence Synthetic DNA645 cgtacccgaa aaacaaaccg atctcttcta ccgaacgcat caaacgagct 50 646 50 DNAArtificial Sequence Synthetic DNA 646 tgccgatcag cgcggtaact tcgtgagctcgtttgatgcg ttcggtagaa 50 647 50 DNA Artificial Sequence Synthetic DNA647 cacgaagtta ccgcgctgat cggcatcgac accacccaca aaacctatct 50 648 50 DNAArtificial Sequence Synthetic DNA 648 cgggtcaacg tcctgcatgt ggcacagataggttttgtgg gtggtgtcga 50 649 50 DNA Artificial Sequence Synthetic DNA649 gtgccacatg caggacgttg acccgaccct gtccgttgaa tactccgaag 50 650 50 DNAArtificial Sequence Synthetic DNA 650 cgctctggga tctggcagaa ttcagcttcggagtattcaa cggacagggt 50 651 50 DNA Artificial Sequence Synthetic DNA651 accctgtccg ttgaatactc cgaagctgaa ttctgccaga tcccagagcg 50 652 50 DNAArtificial Sequence Synthetic DNA 652 gttcgccagg atagaacgac gggtacgctctgggatctgg cagaattcag 50 653 50 DNA Artificial Sequence Synthetic DNA653 tacccgtcgt tctatcctgg cgaacttccg tcagctgtac gaaacccaag 50 654 47 DNAArtificial Sequence Synthetic DNA 654 tcttcttcac gttcagggtt gtcttgggtttcgtacagct gacggaa 47 655 46 DNA Artificial Sequence Synthetic DNA 655acaaccctga acgtgaagaa gatgttgttt cccagaacga aatctg 46 656 46 DNAArtificial Sequence Synthetic DNA 656 cggagaggtg ttgtcgtact ggcagatttcgttctgggaa acaaca 46 657 22 DNA Artificial Sequence Synthetic DNA 657ccagtacgac aacacctctc cg 22 658 21 DNA Artificial Sequence Synthetic DNA658 cggagaggtg ttgtcgtact g 21 659 47 DNA Artificial Sequence SyntheticDNA 659 gtattggatc cttattacgg agaggtgttg tcgtactggc agatttc 47
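
The listing above appears to alternate forward-strand oligonucleotides with oligonucleotides from the opposite strand, so that adjacent entries can anneal through complementary overlaps. The following is a minimal sketch of how such complementarity could be checked, using SEQ ID NOS: 657 and 658 as copied from the listing. The helper names (`clean`, `reverse_complement`, `anneals_fully`) are hypothetical and are not part of the disclosure.

```python
# Illustrative check only; the function names are assumptions, not from the disclosure.
COMPLEMENT = str.maketrans("acgt", "tgca")


def clean(entry: str) -> str:
    """Remove the whitespace that separates the 10-base groups in the listing."""
    return "".join(entry.split())


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_fully(short: str, long: str) -> bool:
    """True if every base of `short` can pair with a contiguous stretch of `long`."""
    return reverse_complement(short) in long


# Sequences copied from SEQ ID NO: 657 (22 bases) and SEQ ID NO: 658 (21 bases).
seq_657 = clean("ccagtacgac aacacctctc cg")
seq_658 = clean("cggagaggtg ttgtcgtact g")

if __name__ == "__main__":
    print(len(seq_657), len(seq_658))
    print(anneals_fully(seq_658, seq_657))
```

The same check can be applied to any other pair of entries to confirm which regions are expected to hybridize.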

What is claimed is:
 1. A method of synthesizing a DNA sequence comprising: (i) dividing the DNA sequence recursively into small pieces of DNA, wherein adjacent pieces comprise overlapping regions; (ii) optimizing the sequences of the pieces of DNA resulting from each recursive division to strengthen correct hybridizations and to disrupt incorrect hybridizations; (iii) obtaining the optimized small pieces of DNA, wherein the overlapping regions of any adjacent pieces of single-stranded DNA are complementary; (iv) combining the pieces of DNA derived from the division of the next-larger piece of DNA; (v) allowing the pieces of DNA to self-assemble to form a DNA construct comprising single-stranded DNA segments connected by double-stranded overlap regions; (vi) producing the next-larger piece of DNA from the DNA construct; and (vii) repeating steps (iv), (v), and (vi) in reverse order of the recursive division in step (i) to produce the DNA sequence.
 2. The method of claim 1, wherein a next-larger piece of DNA comprises a mixture of DNA molecules, the method further comprising: selecting a DNA molecule from the mixture likely to have the correct DNA sequence, and using the selected DNA molecule in the synthesis of the DNA sequence.
 3. The method of claim 2, wherein a DNA molecule is separated from the mixture by cloning.
 4. The method of claim 2, wherein the selection comprises sequencing a sample of DNA molecules from the mixture and selecting a DNA molecule with the desired DNA sequence.
 5. The method of claim 2, wherein the selection comprises expressing a polypeptide from each member of a sample of DNA molecules from the mixture, determining the molecular weight of the polypeptide, and selecting a DNA molecule from which a polypeptide with a predetermined molecular weight is expressed.
 6. The method of claim 5, wherein a start codon and/or a stop codon is incorporated into the DNA molecule from which a polypeptide is expressed.
 7. The method of claim 6, wherein the reading frame of the DNA molecule is adjusted with respect to the start codon and/or stop codon.
 8. The method of claim 5, wherein each member of the sample of DNA molecules is inserted into an expression vector, and wherein the expression vector comprises a stop codon downstream from the inserted DNA molecule.
 9. The method of claim 5, wherein the molecular weight of the polypeptide is determined by electrophoresis.
 10. The method of claim 1, wherein the DNA sequence comprises a regulatory sequence.
 11. The method of claim 1, wherein the DNA sequence comprises an intergenic sequence.
 12. The method of claim 1, wherein the DNA sequence encodes apolypeptide.
 13. The method of claim 12, wherein the polypeptide is afull-length protein.
 14. The method of claim 1, wherein dividing the DNAsequence into small pieces of DNA is performed in a single division. 15.The method of claim 1, wherein dividing the DNA sequence into smallpieces of DNA is performed in a plurality of divisions.
 16. The methodof claim 15, wherein the DNA sequence is divided into pieces of DNA ofabout 1,500 bases long or shorter.
 17. The method of claim 1, whereinthe small pieces of DNA are about 60 bases long or shorter.
 18. Themethod of claim 17, wherein the small pieces of DNA are about 50 baseslong or shorter.
 19. The method of claim 1, wherein the overlapping regions comprise from about 6 to about 60 base-pairs.
 20. The method of claim 1, wherein optimizing comprises calculating a melting temperature for the pieces of DNA.
 21. The method of claim 1, wherein optimizing comprises calculating a parameter related to hybridization propensity for the pieces of DNA.
 22. The method of claim 21, wherein the parameter is selected from the group consisting of free energy, enthalpy, entropy, and arithmetic or algebraic combinations thereof.
 23. The method of claim 20, wherein the melting temperature of the lowest melting correct hybridization is at least 1° C. higher than the melting temperature of the highest melting incorrect hybridization.
 24. The method of claim 20, wherein the melting temperature of the lowest melting correct hybridization is at least 4° C. higher than the melting temperature of the highest melting incorrect hybridization.
 25. The method of claim 20, wherein the melting temperature of the lowest melting correct hybridization is at least 8° C. higher than the melting temperature of the highest melting incorrect hybridization.
 26. The method of claim 20, wherein the melting temperature of the lowest melting correct hybridization is at least 16° C. higher than the melting temperature of the highest melting incorrect hybridization.
 27. The method of claim 1, wherein optimizing comprises taking advantage of the degeneracy in the regulatory region consensus sequence.
 28. The method of claim 1, wherein optimizing comprises direct base assignment.
 29. The method of claim 1, wherein optimizing comprises adjusting boundary points between adjacent pieces of DNA.
 30. The method of claim 1, wherein optimizing comprises permuting silent codon substitutions.
 31. The method of claim 1, wherein at least one of the optimized small pieces of DNA is synthetic.
 32. The method of claim 1, wherein at least one of the optimized small pieces of DNA is single-stranded.
 33. The method of claim 1, wherein a single-stranded DNA segment has a length of from about zero bases to about 20 bases.
 34. The method of claim 1, wherein the next-larger piece of DNA is produced by cloning the DNA construct.
 35. The method of claim 34, wherein the cloning is selected from the group consisting of exonuclease III cloning, topoisomerase cloning, restriction enzyme cloning, and homologous recombination cloning.
 36. The method of claim 1, wherein the next-larger piece of DNA is produced by ligating the DNA construct.
 37. The method of claim 1, wherein the next-larger piece of DNA is produced by extending the DNA construct by a reaction using DNA polymerase.
 38. The method of claim 37, wherein the DNA polymerase is a proof-reading DNA polymerase.
 39. The method of claim 37, further comprising mixing a DNA polymerase primer with the pieces of DNA derived from the division of the next-larger piece of DNA.
 40. The method of claim 1, further comprising designing a restriction site into an overlapping region.
 41. The method of claim 40, further comprising digesting the restriction site with a site-specific restriction enzyme.
 42. A DNA sequence synthesized according to a method comprising: (i) dividing the DNA sequence recursively into small pieces of DNA, wherein adjacent pieces comprise overlapping regions; (ii) optimizing the sequences of the pieces of DNA resulting from each recursive division to strengthen correct hybridizations and to disrupt incorrect hybridizations; (iii) obtaining the optimized small pieces of DNA, wherein the overlapping regions of any adjacent pieces of single-stranded DNA are complementary; (iv) combining the pieces of DNA derived from the division of the next-larger piece of DNA; (v) allowing the pieces of DNA to self-assemble to form a DNA construct comprising single-stranded DNA segments connected by double-stranded overlap regions; (vi) producing the next-larger piece of DNA from the DNA construct; and (vii) repeating steps (iv), (v), and (vi) in reverse order of the recursive division in step (i) to produce the DNA sequence.
 43. The DNA sequence of claim 42, wherein a next-larger piece of DNA comprises a mixture of DNA molecules, the method further comprising: selecting a DNA molecule from the mixture likely to have the correct DNA sequence, and using the selected DNA molecule in the synthesis of the DNA sequence.
 44. The DNA sequence of claim 43, wherein a DNA molecule is separated from the mixture by cloning.
 45. The DNA sequence of claim 43, wherein the selection comprises sequencing a sample of DNA molecules from the mixture and selecting a DNA molecule with the desired DNA sequence.
 46. The DNA sequence of claim 43, wherein the selection comprises expressing a polypeptide from each member of a sample of DNA molecules from the mixture, determining the molecular weight of the polypeptide, and selecting a DNA molecule from which a polypeptide with a predetermined molecular weight is expressed.
 47. The DNA sequence of claim 46, wherein a start codon and/or a stop codon is incorporated into the DNA molecule from which a polypeptide is expressed.
 48. The DNA sequence of claim 47, wherein the reading frame of the DNA molecule is adjusted with respect to the start codon and/or stop codon.
 49. The DNA sequence of claim 46, wherein each member of the sample of DNA molecules is inserted into an expression vector, and wherein the expression vector comprises a stop codon downstream from the inserted DNA molecule.
 50. The DNA sequence of claim 46, wherein the molecular weight of the polypeptide is determined by electrophoresis.
 51. The DNA sequence of claim 42, wherein the DNA sequence comprises a regulatory sequence.
 52. The DNA sequence of claim 42, wherein the DNA sequence comprises an intergenic sequence.
 53. The DNA sequence of claim 42, wherein the DNA sequence encodes a polypeptide.
 54. The DNA sequence of claim 53, wherein the polypeptide is a full-length protein.
 55. The DNA sequence of claim 42, wherein dividing the DNA sequence into small pieces of DNA is performed in a single division.
 56. The DNA sequence of claim 42, wherein dividing the DNA sequence into small pieces of DNA is performed in a plurality of divisions.
 57. The DNA sequence of claim 56, wherein the DNA sequence is divided into pieces of DNA of about 1,500 bases long or shorter.
 58. The DNA sequence of claim 42, wherein the small pieces of DNA are about 60 bases long or shorter.
 59. The DNA sequence of claim 58, wherein the small pieces of DNA are about 50 bases long or shorter.
 60. The DNA sequence of claim 42, wherein the overlapping regions comprise from about 6 to about 60 base-pairs.
 61. The DNA sequence of claim 42, wherein optimizing comprises calculating a melting temperature for the pieces of DNA.
 62. The DNA sequence of claim 42, wherein optimizing comprises calculating a parameter related to hybridization propensity for the pieces of DNA.
 63. The DNA sequence of claim 42, wherein the parameter is selected from the group consisting of free energy, enthalpy, entropy, and arithmetic or algebraic combinations thereof.
 64. The DNA sequence of claim 61, wherein the melting temperature of the lowest melting correct hybridization is at least 1° C. higher than the melting temperature of the highest melting incorrect hybridization.
 65. The DNA sequence of claim 61, wherein the melting temperature of the lowest melting correct hybridization is at least 4° C. higher than the melting temperature of the highest melting incorrect hybridization.
 66. The DNA sequence of claim 61, wherein the melting temperature of the lowest melting correct hybridization is at least 8° C. higher than the melting temperature of the highest melting incorrect hybridization.
 67. The DNA sequence of claim 61, wherein the melting temperature of the lowest melting correct hybridization is at least 16° C. higher than the melting temperature of the highest melting incorrect hybridization.
 68. The DNA sequence of claim 51, wherein optimizing comprises taking advantage of the degeneracy in the regulatory region consensus sequence.
 69. The DNA sequence of claim 52, wherein optimizing comprises direct base assignment.
 70. The DNA sequence of claim 52, wherein optimizing comprises adjusting boundary points between adjacent pieces of DNA.
 71. The DNA sequence of claim 53, wherein optimizing comprises permuting silent codon substitutions.
 72. The DNA sequence of claim 42, wherein at least one of the optimized small pieces of DNA is synthetic.
 73. The DNA sequence of claim 42, wherein at least one of the optimized small pieces of DNA is single-stranded.
 74. The DNA sequence of claim 42, wherein a single-stranded DNA segment has a length of from about zero bases to about 20 bases.
 75. The DNA sequence of claim 42, wherein the next-larger piece of DNA is produced by cloning the DNA construct.
 76. The DNA sequence of claim 75, wherein the cloning is selected from the group consisting of exonuclease III cloning, topoisomerase cloning, restriction enzyme cloning, and homologous recombination cloning.
 77. The DNA sequence of claim 42, wherein the next-larger piece of DNA is produced by ligating the DNA construct.
 78. The DNA sequence of claim 42, wherein the next-larger piece of DNA is produced by extending the DNA construct by a reaction using DNA polymerase.
 79. The DNA sequence of claim 78, wherein the DNA polymerase is a proof-reading DNA polymerase.
 80. The DNA sequence of claim 78, further comprising mixing a DNA polymerase primer with the pieces of DNA derived from the division of the next-larger piece of DNA.
 81. The DNA sequence of claim 42, further comprising designing a restriction site into an overlapping region.
 82. The DNA sequence of claim 81, further comprising digesting the restriction site with a site-specific restriction enzyme.
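
By way of illustration only, and without limiting the claims, the division into overlapping pieces and the melting-temperature margin recited above may be sketched as follows. The sketch assumes a single division (as in claim 14) rather than a recursive one, and it estimates melting temperature with the Wallace rule (2° C. per A or T, 4° C. per G or C), a rough approximation chosen here for brevity and not a calculation specified by the claims. The function names `split_with_overlaps`, `overlap_regions`, `wallace_tm`, and `tm_margin` are hypothetical, and the candidate incorrect hybridizations are assumed to be supplied by the caller.

```python
def split_with_overlaps(seq, piece_len, overlap):
    """Divide seq into pieces of at most piece_len bases.

    Adjacent pieces share `overlap` bases. Returns (start, piece) tuples.
    Single-level division only; a recursive scheme would call this again
    on each piece.
    """
    if not 0 < overlap < piece_len:
        raise ValueError("need 0 < overlap < piece_len")
    step = piece_len - overlap
    pieces, start = [], 0
    while True:
        end = min(start + piece_len, len(seq))
        pieces.append((start, seq[start:end]))
        if end == len(seq):
            break
        start += step
    return pieces


def overlap_regions(pieces):
    """Return the shared region for each pair of adjacent pieces."""
    return [
        left[right_start - left_start:]
        for (left_start, left), (right_start, _) in zip(pieces, pieces[1:])
    ]


def wallace_tm(seq):
    """Wallace-rule estimate in degrees C (assumption: short duplexes only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_margin(correct_tms, incorrect_tms):
    """Lowest correct Tm minus highest incorrect Tm, in degrees C."""
    return min(correct_tms) - max(incorrect_tms)


if __name__ == "__main__":
    target = "ACGT" * 10  # placeholder sequence, not from the disclosure
    pieces = split_with_overlaps(target, piece_len=16, overlap=4)
    correct = [wallace_tm(region) for region in overlap_regions(pieces)]
    print(pieces)
    print(correct)
```

A design would satisfy the margin of claim 24, for example, when `tm_margin(correct, incorrect) >= 4`, where `incorrect` holds the estimated melting temperatures of the unwanted pairings under whatever thermodynamic model is actually used.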